A Novel Algorithm for Finding Interspersed Repeat Regions

Abstract: The analysis of repeats in DNA sequences is an important subject in bioinformatics. In this paper, we propose a novel projection-assemble algorithm to find unknown interspersed repeats in DNA sequences. The algorithm employs a random projection algorithm to obtain a candidate fragment set, then an exhaustive search algorithm to examine each pair of fragments in the candidate fragment set for potential linkage, and finally assembles the linked fragments together. The complexity of our projection-assemble algorithm is nearly linear in the length of the genome sequence, and its memory usage is limited by the hardware. We tested our algorithm with both simulated data and real biological data, and the results show that our projection-assemble algorithm is efficient. By means of this algorithm, we found an unlabeled repeat region that occurs five times in the Escherichia coli genome, with a length of more than 5,000 bp and a mismatch probability of less than 4%.
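The candidate-generation step described in the abstract can be illustrated with a minimal Python sketch of random projection over fixed-length fragments. This is not the authors' implementation: the function names (`projection_buckets`, `candidate_pairs`, `verified_pairs`, `hamming`), the parameters `l`, `k`, `trials`, and `max_mismatch`, and the pairwise Hamming check are all assumptions made for illustration, and the assembly step is not shown.

```python
import random
from collections import defaultdict


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def projection_buckets(seq, l, k, rng):
    """Hash every l-mer of seq by its characters at k randomly chosen offsets.

    Fragments that differ only outside the chosen offsets share a bucket.
    """
    offsets = sorted(rng.sample(range(l), k))
    buckets = defaultdict(list)
    for start in range(len(seq) - l + 1):
        key = "".join(seq[start + p] for p in offsets)
        buckets[key].append(start)
    return buckets


def candidate_pairs(seq, l, k, trials=5, seed=0):
    """Collect pairs of start positions that collide in any projection trial."""
    rng = random.Random(seed)
    pairs = set()
    for _ in range(trials):
        buckets = projection_buckets(seq, l, k, rng)
        for starts in buckets.values():
            # Quadratic in bucket size; acceptable for a small illustration.
            for i in range(len(starts)):
                for j in range(i + 1, len(starts)):
                    pairs.add((starts[i], starts[j]))
    return pairs


def verified_pairs(seq, pairs, l, max_mismatch):
    """Keep only candidate pairs whose full l-mers are within max_mismatch."""
    return {
        (a, b)
        for a, b in pairs
        if hamming(seq[a:a + l], seq[b:b + l]) <= max_mismatch
    }


if __name__ == "__main__":
    fragment = "ACGGTCATTG"
    genome = fragment + "TTTTT" + fragment
    pairs = candidate_pairs(genome, l=10, k=4)
    print(sorted(verified_pairs(genome, pairs, l=10, max_mismatch=1)))
```

Repeating the projection over several trials raises the chance that two fragments with a few mismatches collide at least once, because each trial ignores a different random subset of positions.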
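Institution: Not specified
Publication date: March 13, 2004 (the date the record first went online on the 中国期刊网 platform; this does not represent the paper's publication date)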