Background: Whole-genome sequencing is performed routinely as a means to identify polymorphic genetic loci such as short tandem repeat (STR) loci. We have developed a simple, freely available tool, called pSTR Finder, for identifying putative polymorphic STR loci from genome-wide sequence data. The program performs cross comparisons on the STR sequences that the Tandem Repeats Finder generates from multiple genome samples in FASTA format. When two samples are compared, these comparisons generate reports listing identical, polymorphic, and different STR loci.

Methods: The web site http://forensic.mc.ntu.edu.tw:9000/PSTRWeb/Default has been developed as a means to identify polymorphic STR loci within large and complex genome sequences. The program was developed to generate a series of user-friendly reports.

Results: As proof of concept for the program, four FASTA genome sequence samples of human chromosome X (AC_000155.1, CM000685.1, NC_018934.2, and CM000274.1) were obtained from GenBank and analyzed for the presence of putative STR regions. AC_000155.1 was used as the initial reference sequence, against which 5443 identical and 4305 polymorphic STR loci were identified using repeat units of 1-6 bp and 10 bp of flanking sequence on either side of each putative STR locus. A reliability test was used to compare five FASTA samples, which had sections of DNA sequence removed to mimic partial or fragmented DNA sequences, to determine whether pSTR Finder can efficiently and consistently find identical, polymorphic, and different STR loci.

Conclusions: From the mass of DNA sequence data, the program was found to reproducibly identify polymorphic STR loci and to generate user-friendly reports detailing the number and location of these potential polymorphic loci. This freely available program was found to be a useful tool for finding polymorphic STR loci within whole-genome sequence data in forensic genetic studies.
- Massively parallel sequencing
- Whole-genome sequences
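
The cross comparison described in the abstract, which sorts the STR loci of two samples into identical, polymorphic, and different, can be illustrated with a minimal Python sketch. It is not the pSTR Finder implementation. The matching rule is an assumption made for illustration: two loci count as the same locus when their flanking sequences and repeat motif agree, they are identical when the copy number also agrees, and polymorphic when it does not. The names `StrLocus` and `compare_loci` and the example records are hypothetical.

```python
from typing import NamedTuple


class StrLocus(NamedTuple):
    """One STR record (hypothetical layout, not the pSTR Finder format)."""
    left_flank: str   # e.g. 10 bp upstream of the repeat
    motif: str        # repeat unit, 1-6 bp
    copies: int       # number of times the motif is repeated
    right_flank: str  # e.g. 10 bp downstream of the repeat


def compare_loci(reference, sample):
    """Classify each reference locus against a second sample.

    Assumed rule: loci are paired on (left flank, motif, right flank).
    Same copy number -> identical; different copy number -> polymorphic;
    no pairing found -> different.
    """
    sample_by_key = {
        (s.left_flank, s.motif, s.right_flank): s for s in sample
    }
    report = {"identical": [], "polymorphic": [], "different": []}
    for ref in reference:
        match = sample_by_key.get((ref.left_flank, ref.motif, ref.right_flank))
        if match is None:
            report["different"].append(ref)
        elif match.copies == ref.copies:
            report["identical"].append(ref)
        else:
            report["polymorphic"].append(ref)
    return report


# Made-up example records with 10 bp flanks.
reference = [
    StrLocus("ACGTACGTAC", "CA", 12, "TTGACCAGTA"),
    StrLocus("GGATCCATTG", "GATA", 9, "CCTAGGTTAC"),
    StrLocus("TTTTGACCAA", "A", 15, "GGCCATATGC"),
]
sample = [
    StrLocus("ACGTACGTAC", "CA", 12, "TTGACCAGTA"),    # same copy number
    StrLocus("GGATCCATTG", "GATA", 11, "CCTAGGTTAC"),  # copy number differs
]
report = compare_loci(reference, sample)
for category, loci in report.items():
    print(category, len(loci))
```

Pairing on exact flank matches keeps the sketch short. A real comparison would probably need to tolerate sequencing errors in the flanks and handle loci that occur more than once in a sample.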