This project contains the datasets and code used in our study:
"A comprehensive analysis of software development and distribution practices followed by 430 RNA-seq tools developed from 2008 to 2024."
Sharma, S., et al. (2025) Robust software development practices improve citations of RNA-seq tools. Biopolymers and Cell. DOI: 10.7124/bc.000AFE
We compiled publications describing novel RNA-seq tools from Google Scholar, PubMed, and Oxford Academic.
Our approach for extracting and verifying software links is described in the Methods section of the manuscript. Links that timed out during automated checking were verified manually.
The dataset is provided as a CSV file and contains the following fields:
- Name of the tool
- Year of publication
- Software interface utilized
- Package manager availability
- Docker containerization support
- Multithreading capacity
- User guide availability
- Sample dataset availability
- Archival stability
- Number of releases or updates
- Benchmarking practices
- License type
- Citation count
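
As a quick orientation to the table, here is a minimal sketch of reading it with Python's standard `csv` module and counting tools per publication year. The column headers (`Name of the tool`, `Year of publication`, `License type`) are assumed to match the field names listed above, and the three rows are invented placeholders, so check the header row of the actual CSV file before relying on them.

```python
import csv
import io
from collections import Counter

# Assumed column headers; the real CSV may use different ones.
NAME_COL = "Name of the tool"
YEAR_COL = "Year of publication"
LICENSE_COL = "License type"


def load_tools(handle):
    """Read the tool table from an open text handle into a list of dicts."""
    return list(csv.DictReader(handle))


def tools_per_year(rows):
    """Count how many tools were published in each year."""
    return Counter(int(row[YEAR_COL]) for row in rows)


# Invented placeholder rows standing in for the real file.
sample = io.StringIO(
    "Name of the tool,Year of publication,License type\n"
    "ToolA,2010,MIT\n"
    "ToolB,2010,GPL-3.0\n"
    "ToolC,2019,MIT\n"
)

rows = load_tools(sample)
counts = tools_per_year(rows)
for year in sorted(counts):
    print(year, counts[year])
```

To run this against the real dataset, pass a handle from `open(path, newline="", encoding="utf-8")` to `load_tools` instead of `sample`, where `path` points to the CSV file in this repository.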
To reproduce our figures and results, we provide Google Colab Notebooks:
All figures and analyses can be reproduced using the accompanying code and data.
This repository is released under the MIT License.
Please contact us with comments, suggestions, or questions: