Commit e452151

lukecavabarrett authored and milot-mirdita committed
Fix bug in IndexTable::printStatistics
The code that computes the top_N elements is buggy: it simply finds the topmost element that is smaller than the current one and overwrites it. The correct behaviour is that once we decide we are better than element j, that element should be moved to j+1, j+1 should be moved to j+2, and so on. This is easily fixed with std::move_backward, which moves the elements starting from the last one, so that none is overwritten before it has been moved. If in the future we'd like to make this more efficient, we could implement this using a heap instead.

This is (part of) the log we get when calling createindex on the example data, prior to this fix:

    Index statistics
    Entries:          8552346
    DB size:          537 MB
    Avg k-mer size:   0.133630
    Top 10 k-mers
        GQQVAR    190
        QLGQRV    110
        IHDKNI    105
        ALGSGK    105
        LLPGKT    102
        SGGTLR    84
        SGLGRV    75
        VGSSST    61
        VMHAGS    59
        ATADTT    59

And this is after this fix:

    Index statistics
    Entries:          8552346
    DB size:          537 MB
    Avg k-mer size:   0.133630
    Top 10 k-mers
        GQQVAR    190
        GPGGTL    134
        SGQQAI    112
        QLGQRV    110
        GKTLIA    106
        GGKRIA    106
        IHDKNI    105
        ALGSGK    105
        LLPGKT    102
        QRRARA    94
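To make the difference concrete, here is a minimal standalone sketch of the insertion step with and without the shift. It is not the code from IndexTable.h: `Entry`, `insertTop`, `topCounts` and the `shift` flag are hypothetical names introduced only for illustration, and the array is assumed to be kept in descending order of count.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the (count, k-mer index) pairs held in topElements.
using Entry = std::pair<std::size_t, std::size_t>;

// Inserts (size, idx) into a descending top-N array.
// shift == false reproduces the old behaviour (overwrite only);
// shift == true first shifts the tail down by one slot, as in this commit.
void insertTop(Entry *top, std::size_t topN, std::size_t size, std::size_t idx, bool shift) {
    for (std::size_t j = 0; j < topN; j++) {
        if (top[j].first < size) {
            if (shift) {
                // Move [j, topN-1) to [j+1, topN), copying from the back so that
                // no element is overwritten before it is moved. The last entry drops out.
                std::move_backward(top + j, top + topN - 1, top + topN);
            }
            top[j] = Entry(size, idx);
            break;
        }
    }
}

// Runs the insertion over all sizes and returns only the counts, largest first.
std::vector<std::size_t> topCounts(const std::vector<std::size_t> &sizes, std::size_t topN, bool shift) {
    std::vector<Entry> top(topN, Entry(0, 0));
    for (std::size_t i = 0; i < sizes.size(); i++) {
        insertTop(top.data(), topN, sizes[i], i, shift);
    }
    std::vector<std::size_t> counts;
    for (const Entry &e : top) {
        counts.push_back(e.first);
    }
    return counts;
}
```

For the input sizes 5, 3, 9, 7 with topN = 3, the overwrite-only variant ends with 9, 7, 0 (the 5 is lost when 9 overwrites it), while the shifting variant ends with 9, 7, 5.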
1 parent 12a0d9a commit e452151

1 file changed: +1 -0 lines changed

src/prefiltering/IndexTable.h

Lines changed: 1 addition & 0 deletions
@@ -276,6 +276,7 @@ class IndexTable {
                 continue;
             for (size_t j = 0; j < top_N; j++) {
                 if (topElements[j].first < ((size_t) size)) {
+                    std::move_backward(topElements+j, topElements+top_N-1, topElements+top_N);
                     topElements[j].first = static_cast<unsigned long>(size);
                     topElements[j].second = i;
                     break;
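
The commit message mentions that a heap could make this more efficient in the future. The following is only a sketch of that idea and is not part of this commit: it keeps a min-heap of at most top_N entries using `std::priority_queue`, so each insertion costs O(log top_N) instead of a linear shift. The names `Entry` and `topNWithHeap` are hypothetical, and ties are broken by index through the pair comparison, which may differ from the array-based loop.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical stand-in for the (count, k-mer index) pairs held in topElements.
using Entry = std::pair<std::size_t, std::size_t>;

// Returns the topN largest (count, index) pairs, largest first.
std::vector<Entry> topNWithHeap(const std::vector<std::size_t> &sizes, std::size_t topN) {
    // Min-heap: the smallest of the currently kept entries sits on top.
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;
    for (std::size_t i = 0; i < sizes.size(); i++) {
        if (heap.size() < topN) {
            heap.push(Entry(sizes[i], i));
        } else if (topN > 0 && heap.top().first < sizes[i]) {
            // The new count beats the current minimum, so replace it.
            heap.pop();
            heap.push(Entry(sizes[i], i));
        }
    }
    // Drain the heap (ascending order) and reverse to get largest first.
    std::vector<Entry> result;
    while (!heap.empty()) {
        result.push_back(heap.top());
        heap.pop();
    }
    std::reverse(result.begin(), result.end());
    return result;
}
```

With top_N fixed at 10 for the statistics output, the linear shift is already cheap, so the heap would mainly pay off if top_N were made much larger.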
