Parsing and manipulating long or multiple protein or gene sequences can be challenging for experimental biologists and microbiologists who lack prior knowledge of bioinformatics and programming. Here we present Shetti, a simple, user-friendly and versatile tool to parse, manipulate and search large datasets of long and multiple protein or gene sequences. Shetti can be used to search for a sequence, species, protein/gene or pattern/motif. It can also be used to construct a universal consensus or molecular signatures for proteins based on their physical characteristics. Shetti handles large sets of long sequences efficiently. It parses UniProt Knowledgebase and NCBI GenBank flat files and presents their contents as a table.
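To illustrate the kind of task described above, the following is a minimal Python sketch of reading UniProt-style flat-file text into table rows and searching the sequences for a motif. It is not Shetti's implementation: the function names `parse_uniprot_flat` and `find_motif`, the choice of columns, and the two sample entries (with their accessions and sequences) are hypothetical and exist only for this example. The sketch assumes the standard two-letter line codes of the UniProtKB flat-file format (`ID`, `AC`, `OS`, `SQ`, and `//` as the entry terminator) and expresses the motif as a regular expression.

```python
import re

# Two invented entries in UniProt-style flat-file layout (hypothetical data).
SAMPLE = """\
ID   DEMO1_HUMAN             Reviewed;          20 AA.
AC   X00001;
OS   Homo sapiens (Human).
SQ   SEQUENCE   20 AA;  0 MW;  0 CRC64;
     MKTAYIAKQR NGSQDELVAA
//
ID   DEMO2_ECOLI             Reviewed;          15 AA.
AC   X00002; X00003;
OS   Escherichia coli.
SQ   SEQUENCE   15 AA;  0 MW;  0 CRC64;
     MNKSLLPAGT WRRKE
//
"""


def _empty_entry():
    return {"id": "", "accession": "", "organism": "", "sequence": ""}


def parse_uniprot_flat(text):
    """Return one dict (table row) per entry in UniProt-style flat-file text."""
    rows = []
    entry = _empty_entry()
    in_sequence = False
    for line in text.splitlines():
        code = line[:2]
        if code == "//":                      # entry terminator
            if entry["id"]:
                rows.append(entry)
            entry = _empty_entry()
            in_sequence = False
        elif in_sequence:                     # residue lines follow the SQ header
            entry["sequence"] += line.replace(" ", "")
        elif code == "ID":
            entry["id"] = line.split()[1]
        elif code == "AC" and not entry["accession"]:
            # keep only the first (primary) accession
            entry["accession"] = line[5:].split(";")[0].strip()
        elif code == "OS":                    # organism name may span several lines
            entry["organism"] = (entry["organism"] + " " + line[5:].strip()).strip()
        elif code == "SQ":
            in_sequence = True
    return rows


def find_motif(rows, pattern):
    """Return (entry id, 1-based start, matched text) for every motif hit."""
    regex = re.compile(pattern)
    hits = []
    for row in rows:
        for match in regex.finditer(row["sequence"]):
            hits.append((row["id"], match.start() + 1, match.group()))
    return hits


if __name__ == "__main__":
    table = parse_uniprot_flat(SAMPLE)
    for row in table:
        print(row["id"], row["accession"], row["organism"], len(row["sequence"]), sep="\t")
    # N-glycosylation-like pattern written as a regular expression
    for hit in find_motif(table, r"N[^P][ST][^P]"):
        print(hit)
```

A species search in this sketch would be a filter on the `organism` column, for example `[r for r in table if "Escherichia" in r["organism"]]`. GenBank flat files use a different layout (keywords such as `LOCUS` and `ORIGIN`) and would need a separate parser.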

  • This is an open-access article distributed under the terms of the Creative Commons Attribution Noncommercial License, which permits any noncommercial use, distribution, and reproduction in any medium, provided the source is credited.


An erratum has been published for this content:
Shetti, a simple tool to parse, manipulate and search large dataset of sequences




