Frequently Asked Questions

S4TE 2.0 is an online tool for searching for Type IV Secretion System effector proteins (T4Es) in complete bacterial genomes.
S4TE 2.0 searches for 13 protein sequence characteristics and 1 nucleotide sequence characteristic. These characteristics are linked to secretion, subcellular localization in host cells and/or to the function of the proteins.
S4TE 2.0 is the only online tool able to predict Type IV Secretion System effectors. The search for the 14 characteristics gives information on the location and putative function of proteins. The expert mode analysis (S4TE-EM) lets the user choose which search characteristics are applied to a given bacterial genome.
Each module has been set up to have the best specificity to a dedicated dataset.
Enriched DNA motifs are searched for in a window of 100 nucleotides upstream of the start codon, using MEME (Timothy L. Bailey and Charles Elkan, AAAI Press, 1994). Eight consensus motifs linked to T4SS have been identified in different bacteria (PmrA, Cpm, Cpm2, Apm, Apm2, Bopm, Bapm, Hpm).
BLAST 2.2 is used for protein comparisons to look for homologies with known T4Es. S4TE 2.0 compares the database containing all known T4Es with the query proteome and returns all homologs with an expected value (E) < 10^-4 (Altschul S.F. et al., J. Mol. Biol., 1990).
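As an illustration of the E-value cutoff, the sketch below filters BLAST hits from the standard 12-column tabular output, where the E-value is the 11th column. The function name `filter_blast_hits` is hypothetical, and this is only a minimal sketch of the filtering step, not the parser used by S4TE 2.0.

```python
def filter_blast_hits(tabular_text, max_evalue=1e-4):
    """Return (query, subject, evalue) for hits with an E-value below max_evalue.

    Assumes the standard 12-column BLAST tabular layout:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    hits = []
    for line in tabular_text.splitlines():
        # Skip blank lines and comment lines
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        evalue = float(fields[10])  # 11th column holds the E-value
        if evalue < max_evalue:
            hits.append((fields[0], fields[1], evalue))
    return hits
```

The cutoff is strict (E < 10^-4), matching the wording above; a hit with E exactly equal to the cutoff is discarded.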
The search for eukaryotic-like domains is done by Pfam-scan.pl (R.D. Finn, NAR, 2014). The motif database contains 58 eukaryotic-like domains previously found in effectors (Meyer DF et al, NAR, 2013).
The search for prokaryotic-like domains is done by Pfam-scan.pl (R.D. Finn, NAR, 2014). The motif database contains 3617 prokaryotic-like domains previously found in effectors. (Meyer DF et al, NAR, 2013).
EPIYA search is a new module of S4TE 2.0. The EPIYA domain is a eukaryotic phosphorylation motif. The EPIYA motif search consists of finding conserved EPIYA or EPIYA-like (EPIYA, ENIYE, NPLYE, EHLYA, TPLYA, EPLYA, ESIYE, EDLYA, EPIYG, EPVYA, VPNYA, EHIYD) motifs (T. Hayashi et al. Cell Mic, 2013). Hypothetical EPIYA motifs are searched for with the motif E-X-X-Y-X.
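The two levels of the EPIYA search (known motifs, then the looser E-X-X-Y-X pattern) can be sketched with regular expressions. The names below are hypothetical and the code is an illustrative sketch, not the S4TE 2.0 implementation; positions are 0-based.

```python
import re

# Known EPIYA and EPIYA-like motifs listed above
KNOWN_EPIYA = ("EPIYA", "ENIYE", "NPLYE", "EHLYA", "TPLYA", "EPLYA",
               "ESIYE", "EDLYA", "EPIYG", "EPVYA", "VPNYA", "EHIYD")

# Lookaheads allow overlapping occurrences to be reported
KNOWN_RE = re.compile("(?=(" + "|".join(KNOWN_EPIYA) + "))")
HYPOTHETICAL_RE = re.compile(r"(?=(E..Y.))")  # E-X-X-Y-X


def find_epiya(seq):
    """Return (known, hypothetical) lists of (position, motif) tuples."""
    known = [(m.start(), m.group(1)) for m in KNOWN_RE.finditer(seq)]
    known_positions = {pos for pos, _ in known}
    # Report E-X-X-Y-X hits only where no known motif starts
    hypothetical = [(m.start(), m.group(1))
                    for m in HYPOTHETICAL_RE.finditer(seq)
                    if m.start() not in known_positions]
    return known, hypothetical
```

Keeping the two lists separate lets a caller treat a confirmed EPIYA-like motif differently from a hypothetical one.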
The search for monopartite NLS has been improved in S4TE 2.0 compared to S4TE. We rewrote this module to add more known NLS motifs to the search. A monopartite NLS consists of [KR]-[KR]-[KR]-[KR]-[KR], X-K-[KR]-[KRP]-[KR]-X, X-R-K-[KRP]-[KR]-X, X-R-K-X-[KR]-[KRP], X-K-[KR]-[KR]-X-[KRP], X-R-K-[KR]-X-[KRP], X-K-[KR]-X-[KR]-X-X, X-R-K-X-[KR]-X-X, X-K-[KR]-[KR]-X-X-X and X-R-K-[KR]-X-X-X. Bipartite NLSs are also searched for with the following motif: K-[KR]-X(6,20)-[KR]-[KR]-X-[KR] (Szurek et al. 2002, Desland et al. 2003, Dean et al. 2011, Dinkel et al. 2012).
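The NLS patterns above translate directly into regular expressions, with X written as `.` and X(6,20) as `.{6,20}`. The sketch below is illustrative only: the function name `find_nls` is hypothetical, and the first monopartite pattern is read as five consecutive [KR] residues.

```python
import re

# Monopartite NLS patterns from the list above (X = any residue)
MONOPARTITE = [
    "[KR]{5}",
    ".K[KR][KRP][KR].",
    ".RK[KRP][KR].",
    ".RK.[KR][KRP]",
    ".K[KR][KR].[KRP]",
    ".RK[KR].[KRP]",
    ".K[KR].[KR]..",
    ".RK.[KR]..",
    ".K[KR][KR]...",
    ".RK[KR]...",
]
MONOPARTITE_RES = [re.compile(p) for p in MONOPARTITE]

# Bipartite NLS: K-[KR]-X(6,20)-[KR]-[KR]-X-[KR]
BIPARTITE_RE = re.compile(r"K[KR].{6,20}[KR][KR].[KR]")


def find_nls(seq):
    """Return non-overlapping matches per pattern, with 0-based start positions."""
    mono = [(p.pattern, m.start())
            for p in MONOPARTITE_RES for m in p.finditer(seq)]
    bi = [(m.start(), m.group()) for m in BIPARTITE_RE.finditer(seq)]
    return {"monopartite": mono, "bipartite": bi}
```

For example, `find_nls("MAAKRPAATKKAGQAKKKKLDAA")` reports a bipartite match starting at position 3 as well as several monopartite matches in the lysine-rich stretch.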
To predict MLSs in S4TE 2.0, we use TargetP 1.1. The output is ranked based on reliability class (RC). RC is an indicator that ranges from 1 to 5 where 1 indicates the best prediction. In S4TE 2.0 only predictions with RC >= 2 appear in the results.
Prenylation is a post-translational modification that is required for protein stability and for the binding to membranes. Prenylation involves the covalent addition of a 15-carbon farnesyl or a 20-carbon geranylgeranyl isoprenoid group to a Cys residue within the conserved C-terminal CaaX motif (in which 'a' represents an aliphatic residue and 'X' is one of the 22 amino acids). (Meyer DF et al, NAR, 2013).
Coiled coils are structural motifs in proteins in which at least two α-helices are coiled together. To search for coiled-coil domains, we use the pepcoil software from the EMBOSS package (Meyer DF et al, NAR, 2013).
This feature searches for basic amino acids (H, R, K) in the 25 C-terminal amino acids (Meyer DF et al, NAR, 2013).
Charge is calculated by summing the positively charged amino acids (HRK) and by subtracting the number of negatively charged amino acids (ED) and the negative C-terminal charge (COO-) in the 25 C-terminal amino acids. (Meyer DF et al, NAR, 2013).
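The basicity and charge features described above reduce to simple counts over the last 25 residues. The sketch below follows that description; the function name `c_terminal_features` is hypothetical and any score cutoffs applied by S4TE 2.0 are not reproduced here.

```python
def c_terminal_features(seq, window=25):
    """Return (basic_count, net_charge) for the C-terminal window.

    net_charge = (#H + #R + #K) - (#E + #D) - 1,
    where the final -1 accounts for the C-terminal carboxyl group (COO-).
    """
    tail = seq[-window:]  # shorter sequences are used in full
    basic = sum(tail.count(aa) for aa in "HRK")
    acidic = sum(tail.count(aa) for aa in "ED")
    return basic, basic - acidic - 1
```

For a protein ending in `...KKRHEDA` with no other charged residue in the window, this gives 4 basic residues and a net charge of +1.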
This feature looks for a hydrophobic residue at the third and fourth C-terminal positions. (Meyer DF et al, NAR, 2013).
This feature screens a proteome to find proteins that have a global hydropathy score <= -200 according to the Kyte-Doolittle scale (Kyte J, Doolittle RF, J. Mol. Biol., 1982).
The E-block motif is a stretch of 10 amino acids containing at least 3 glutamates (E). E-block is searched for in a window of 22 amino acids between the -4 and -26 C-terminal positions (L. Huang et al. Cell Mic, 2010).
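A minimal sketch of the hydropathy screen, assuming the "global hydropathy score" is the sum of Kyte-Doolittle values over the whole protein (the text does not spell out the aggregation). Function names are hypothetical; residues outside the 20 standard amino acids contribute 0 in this sketch.

```python
# Kyte-Doolittle hydropathy scale (Kyte and Doolittle, 1982)
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def total_hydropathy(seq):
    """Sum of Kyte-Doolittle values over the sequence (assumed aggregation)."""
    return sum(KYTE_DOOLITTLE.get(aa, 0.0) for aa in seq)


def is_hydrophilic(seq, cutoff=-200.0):
    """True when the summed hydropathy is at or below the cutoff."""
    return total_hydropathy(seq) <= cutoff
```

Because the score is a sum rather than an average, long proteins reach the cutoff more easily than short ones under this assumption.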
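The E-block search can be sketched as a sliding window over the C-terminal region. This sketch assumes the 22-residue search region is the slice that excludes the last 4 residues and starts 26 residues from the C-terminus; the function name `has_e_block` is hypothetical.

```python
def has_e_block(seq, block_len=10, min_glu=3):
    """True if any 10-residue window in the C-terminal search region
    contains at least 3 glutamates (E)."""
    # Region between positions -26 and -4 from the C-terminus (22 residues)
    region = seq[-26:-4]
    n_windows = max(len(region) - block_len + 1, 0)
    for i in range(n_windows):
        if region[i:i + block_len].count("E") >= min_glu:
            return True
    return False
```

Glutamates located in the last 4 residues are ignored, since they fall outside the search region.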
Each module has its own weighting in the S4TE 2.0 search (see the table below). S4TE 2.0 modules have been configured to give the best positive predictive value on Legionella pneumophila subsp. pneumophila str. Philadelphia 1.

The threshold was set after the weighting of each module. The threshold was defined in order to have the best performance for S4TE 2.0 with these weightings. The threshold was chosen by examining the Sensitivity (Se), Specificity (Sp), Positive Predictive Value (PPV) and Negative Predictive Value (NPV) of different thresholds on the test dataset. S4TE 2.0 sets the threshold cutoff at 72, which recovers most L. pneumophila effectors. This threshold combined with the weightings led to the correct prediction (true positives) of 282 of the 286 effectors of L. pneumophila (Se=98%, PPV=74%) and 96 incorrect predictions (false positives) (Sp=96%, NPV=99%).
You can import a complete genome in .gbk (GenBank) format (unassembled genomes do not work). To be compatible with the program, the gbk file must contain the following lines (example with Coxiella burnetii RSA 493):
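The four performance measures follow from the usual confusion-matrix definitions. The sketch below recomputes Se and PPV from the counts given above (282 true positives, 286 - 282 = 4 false negatives, 96 false positives). The number of true negatives is not stated in the text, so Sp and NPV are defined but not recomputed here.

```python
def sensitivity(tp, fn):
    """Se = TP / (TP + FN)"""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Sp = TN / (TN + FP)"""
    return tn / (tn + fp)


def ppv(tp, fp):
    """PPV = TP / (TP + FP)"""
    return tp / (tp + fp)


def npv(tn, fn):
    """NPV = TN / (TN + FN)"""
    return tn / (tn + fn)


# Counts reported for L. pneumophila at the threshold of 72
tp, fn, fp = 282, 286 - 282, 96
print(f"Se  = {sensitivity(tp, fn):.3f}")  # 282 / 286
print(f"PPV = {ppv(tp, fp):.3f}")          # 282 / 378
```

These evaluate to about 0.986 and 0.746, consistent with the reported Se=98% and PPV=74%.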
- DEFINITION Coxiella burnetii RSA 493: NC_002971
- /db_xref="taxon:227377"
and for each gene :
- CDS 140..1495
- /locus_tag="CBU_0001"
- /protein_id="NP_819057"
- /product="chromosomal replication initiator protein DnaA"
- /translation="MSLPTSLW[...]RILSG"
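Before uploading, a GenBank file can be checked for the fields listed above. The following is a rough pre-flight sketch using only the standard library; the function name `check_genbank` is hypothetical, it assumes the standard GenBank column layout (feature keys indented by 5 spaces, qualifiers indented further), and it does not reproduce the validation performed by S4TE 2.0 itself.

```python
import re

REQUIRED_HEADER = ("DEFINITION", '/db_xref="taxon:')
REQUIRED_CDS = ("/locus_tag", "/protein_id", "/product", "/translation")


def check_genbank(text):
    """Return a list of problems; an empty list means all fields were found."""
    problems = [f"missing {tok}" for tok in REQUIRED_HEADER if tok not in text]
    cds_list = []   # one (location, qualifier_names) pair per CDS feature
    current = None  # the CDS currently being read, if any
    for line in text.splitlines():
        feature = re.match(r"^ {5}(\S+)\s+(\S+)", line)
        if feature or not line.startswith(" "):
            # A new feature or a new top-level section closes the current CDS
            current = None
            if feature and feature.group(1) == "CDS":
                current = (feature.group(2), set())
                cds_list.append(current)
        elif current is not None:
            qualifier = re.match(r"^\s+(/\w+)=", line)
            if qualifier:
                current[1].add(qualifier.group(1))
    if not cds_list:
        problems.append("no CDS feature found")
    for location, qualifiers in cds_list:
        for name in REQUIRED_CDS:
            if name not in qualifiers:
                problems.append(f"CDS {location}: missing {name}")
    return problems
```

A typical use would be `check_genbank(open("genome.gbk").read())`, where `genome.gbk` is a placeholder file name.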
After uploading your data into S4TE 2.0, you will be able to use them in S4TE-EM and S4TE-CG.

If your genome upload does not work, please contact support by sending your genbank file to the following email address: sate.cirad.fr
Your data will be private and will remain on the server for 3 months after your analysis.
From the S4TE 2.0 main page, select your genome and click the "Run S4TE 2.0" button. S4TE 2.0 requires pop-ups to be allowed for the result charts. Settings for S4TE 2.0 can be adjusted in S4TE-EM.
Once the S4TE 2.0 analysis is completed, the user can explore the S4TE 2.0 results. These results are composed of two different web pages. The first page shows all S4TE 2.0 results for the selected genome. The user can find a graph showing the distribution of S4TE 2.0 results (top), an overview of S4TE 2.0 results for all selected genome proteins (middle) and, at the bottom, a list of S4TE 2.0 detailed results. By clicking on “See more”, for each protein, the user can see details of all selected characteristics. The user can go back by clicking on “return to genome”. The second result page contains two graphs. The first graph shows the distribution of candidate T4Es according to local gene density. All S4TE 2.0 putative effectors (score>=90) are plotted on this density graph. The second graph shows a circular representation of the query genome with a graph of G+C content. This circular graph shows the position of all proteins (black), putative S4TE 2.0 effectors (pink), proven T4Es (turquoise) and S4TE 2.0 putative effectors in genomic regions with low and high G+C content (yellow and blue respectively).
In S4TE-EM, the user can change the parameters of S4TE 2.0 (weighting for each module). You can also disable some modules and run S4TE 2.0 with as few modules as desired. In this mode, S4TE-EM can be viewed as 14 independent programs to search for protein features in bacterial genomes.
The weighting of each module can be changed using the slider, between the lowest significant weighting (PPV>=0.5) and the maximum calculated for each module ((default weighting)+14*5) (see weightings below).

Users can disable one or more modules in the workflow by unchecking the corresponding boxes. We would like to stress that in S4TE-EM, if any module is disabled, the threshold is removed and the prediction of putative T4Es is no longer meaningful.
S4TE-CG is a new tool developed to compare different repertoires of putative T4Es (effectors) identified by S4TE 2.0. The corresponding algorithm is represented in the graph below.

The user can compare up to 4 genomes at the same time. S4TE 2.0 results from selected genomes (effectomes) are compared with Blastp 2.2 to find homologous proteins in each effectome.
S4TE-CG successively compares all effectomes in a pairwise manner. The overlaps between the effectomes of each genome are calculated and the final results are plotted on a Venn diagram and listed in an interactive table.
S4TE-CG results consist of a Venn diagram and a table summarizing the genes of each intersection between the effectomes.
Each subset has its own corresponding color in the Venn diagram to facilitate the reading of the results.
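The grouping of effectors into Venn diagram subsets can be sketched as follows. This sketch assumes the Blastp step has already assigned homologous effectors to shared family identifiers, so the homology search itself is not reproduced; the function name `venn_regions` and the family identifiers are hypothetical.

```python
from collections import defaultdict


def venn_regions(effectomes):
    """Group effector families by the exact set of genomes that contain them.

    effectomes: dict mapping genome name -> set of homolog-family identifiers.
    Returns a dict mapping frozenset(genome names) -> set of families,
    i.e. one entry per non-empty region of the Venn diagram.
    """
    if len(effectomes) > 4:
        raise ValueError("at most 4 genomes can be compared")
    membership = defaultdict(set)
    for genome, families in effectomes.items():
        for family in families:
            membership[family].add(genome)
    regions = defaultdict(set)
    for family, genomes in membership.items():
        regions[frozenset(genomes)].add(family)
    return dict(regions)
```

With two genomes `A = {"f1", "f2"}` and `B = {"f2", "f3"}`, the result has three regions: `f1` specific to A, `f3` specific to B, and `f2` shared by both.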
S4TE 2.0 is a website consisting of 5 pages (home, work pages and documentation page). On the work pages, a description of the current page is shown on the left panel, the work space on the middle panel and the user account on the right panel.
At the bottom of the page, the user can find the last 5 pieces of news and information regarding S4TE 2.0. By clicking on “all news” the user can browse previously published information.
The user account keeps the history of your work and analyses on the website. Data will be kept for 3 months in your account. This account allows the user to import a genome into S4TE 2.0 for private use. The user can also ask to add an effector to the S4TE 2.0 public database to improve future searches for all genomes.
Registering for S4TE 2.0 will make you part of a wonderful community of researchers from all around the world, working tirelessly to complete their quests in finding new T4 effectors! Be a part of the fellowship to enrich this tool and to make new discoveries.

To reassure the most paranoid of users, all of the input data in S4TE 2.0 are strictly confidential and will never be used for the public tool. However, if the mere idea of giving your email address is a nightmare, you can also download the S4TE 2.0 package from the S4TE-Doc web page. This is a standalone version of S4TE 2.0 without the graphical interface or results.
The newsletter is the simplest way to be informed of S4TE 2.0 improvements. When a new genome or proven T4 effector is added to the S4TE 2.0 database, or when S4TE 2.0 is modified, an automatic email will be sent to you.