fqtools is a software suite for fast processing of FASTQ files. Various file manipulations are supported. See below for a full list of the available subcommands and a brief description of each. Most subcommands take either a single file or a pair of files as input. If no input file is specified, fqtools will attempt to read data from stdin. In this case, it is advisable to specify the format of the data provided. Subcommands that generate FASTQ data write either a single file or a pair of files. If no -o argument is provided, single files are written to stdout.
If you use fqtools in published work, please cite my Bioinformatics paper:
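As a rough illustration of reading from stdin, the sketch below pipes a one-read FASTQ file into the count subcommand and declares the input as uncompressed FASTQ with -f F. The read data is made up, and placing the global arguments before the subcommand name is an assumption about the invocation order; check the output of fqtools -h on your installation.

```shell
# Write a one-read FASTQ file to a temporary location (made-up data).
sample=$(mktemp)
printf '@read1\nACGT\n+\nIIII\n' > "$sample"

if command -v fqtools >/dev/null 2>&1; then
    # No input file is named, so fqtools reads stdin.
    # -f F states that the stream is uncompressed FASTQ.
    # Assumption: global arguments are given before the subcommand.
    fqtools -f F count < "$sample"
else
    echo "fqtools not found on PATH; skipping the example" >&2
fi

# Remove "$sample" when you are finished with it.
```

Stating the format explicitly matters here because there is no file extension to infer it from when data arrives on stdin.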
- Droop, A. P. (2016). fqtools: An efficient software suite for modern FASTQ file manipulation. Bioinformatics (Oxford, England). [DOI:10.1093/bioinformatics/btw088]
fqtools requires building against both the zlib and htslib libraries:
- zlib is required for processing compressed (.gz) data. The code relies on several recent zlib file I/O functions, so zlib must be version 1.2.3.5 or later.
- htslib is required for reading BAM files. If htslib is not installed, download and compile htslib, then alter the HTSDIR path in the fqtools Makefile to point to the htslib source directory.
If zlib is already installed, fqtools can be built with commands similar to the following:
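One possible way to check the zlib version requirement before building is sketched below. It assumes that pkg-config is available and knows about zlib, and that your sort supports the -V (version sort) option; neither is guaranteed on every system. The version_ge helper is a hypothetical name used only for this example.

```shell
# version_ge A B: succeeds when version A is the same as or newer than B.
# Relies on "sort -V" (assumed available; present in GNU coreutils).
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists zlib; then
    zlib_version=$(pkg-config --modversion zlib)
    if version_ge "$zlib_version" 1.2.3.5; then
        echo "zlib $zlib_version satisfies the >= 1.2.3.5 requirement"
    else
        echo "zlib $zlib_version is older than 1.2.3.5" >&2
    fi
else
    echo "could not determine the zlib version with pkg-config" >&2
fi
```

If pkg-config is not available, the ZLIB_VERSION definition in your zlib.h header gives the same information.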
git clone https://github.com/alastair-droop/fqtools
cd fqtools/
git clone https://github.com/samtools/htslib
cd htslib/
autoheader
autoconf
./configure
make
make install
cd ..
make
You might need to run make install as sudo make install. Because the fqtools executable is dynamically linked against htslib, the library must be installed in a location where the built fqtools program can find it. If you cannot (or do not want to) install htslib, you must add the directory containing libhts.so to your LD_LIBRARY_PATH variable.
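If you skip make install for htslib, a setup along the following lines may work. The HTSLIB_DIR value is an assumption: it presumes the commands are run from the fqtools directory and that htslib was cloned and built in the htslib/ subdirectory as above, leaving libhts.so in that directory.

```shell
# Assumed location of the directory that contains libhts.so;
# adjust this to wherever htslib was actually built.
HTSLIB_DIR="$PWD/htslib"

# Prepend the directory, keeping any existing LD_LIBRARY_PATH entries.
LD_LIBRARY_PATH="$HTSLIB_DIR${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
export LD_LIBRARY_PATH

echo "LD_LIBRARY_PATH is now: $LD_LIBRARY_PATH"
```

The setting only lasts for the current shell session; add the lines to your shell start-up file to make it persistent.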
fqtools is released under the GNU General Public License version 3.
The fqtools suite contains the following subcommands:
- view: View FASTQ files
- head: View the first reads in FASTQ files
- count: Count FASTQ file reads
- header: View FASTQ file header data
- sequence: View FASTQ file sequence data
- quality: View FASTQ file quality data
- header2: View FASTQ file secondary header data
- fasta: Convert FASTQ files to FASTA format
- basetab: Tabulate FASTQ base frequencies
- qualtab: Tabulate FASTQ quality character frequencies
- type: Attempt to guess the FASTQ quality encoding type
- validate: Validate FASTQ files
- find: Find FASTQ reads containing specific sequences
- trim: Trim reads in a FASTQ file
- qualmap: Translate quality values using a mapping file
Each subcommand has its own set of arguments. The global arguments are:
- -h: Show this help message and exit.
- -v: Show the program version and exit.
- -d: Allow DNA sequence bases (ACGTN)
- -r: Allow RNA sequence bases (ACGUN)
- -a: Allow ambiguous sequence bases (RYKMSWBDHV)
- -m: Allow mask sequence base (X)
- -u: Allow uppercase sequence bases
- -l: Allow lowercase sequence bases
- -p CHR: Set the pair replacement character (default "%")
- -b BUFSIZE: Set the input buffer size
- -B BUFSIZE: Set the output buffer size
- -q QUALTYPE: Set the quality score encoding
- -f FORMAT: Set the input file format
- -F FORMAT: Set the output file format
- -i: Read interleaved input file pairs
- -I: Write interleaved output file pairs
CHR
This character will be replaced by the pair value when writing paired files.
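To illustrate the idea, the sketch below expands an output name template into one name per pair member. It is not taken from the fqtools source: the pair_name helper is hypothetical, the pair values 1 and 2 are an assumption, and the sketch replaces every occurrence of the character, which may differ from what fqtools does.

```shell
# pair_name TEMPLATE PAIR [CHR]
# Prints TEMPLATE with each CHR (default "%") replaced by PAIR.
# Hypothetical helper for illustration only. CHR is used as a sed
# pattern, so regex-special characters such as "." will not be
# treated literally.
pair_name() {
    template=$1
    pair=$2
    chr=${3:-%}
    printf '%s\n' "$template" | sed "s|$chr|$pair|g"
}

# Assumed pair values: 1 for the first file, 2 for the second.
pair_name 'sample_R%.fastq.gz' 1
pair_name 'sample_R%.fastq.gz' 2
```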
BUFSIZE
Possible suffixes are [bkMG]. If no suffix is given, value is in bytes.
QUALTYPE
- u: Do not assume a specific quality score encoding
- s: Interpret quality scores as Sanger encoded
- o: Interpret quality scores as Solexa encoded
- i: Interpret quality scores as Illumina encoded
FORMAT
- F: uncompressed FASTQ format (.fastq)
- f: compressed FASTQ format (.fastq.gz)
- b: unaligned BAM format (.bam)
- u: attempt to infer format from file extension (default .fastq.gz)
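The extension rule behind the u setting can be pictured with the small sketch below. It only mirrors the list above (with compressed FASTQ as the fallback) and is not the fqtools implementation; infer_format is a hypothetical helper name, and fqtools may recognise extensions that are not listed here.

```shell
# infer_format FILENAME
# Prints the FORMAT code suggested by the file extension, falling
# back to "f" (compressed FASTQ) when the extension is not recognised.
# Hypothetical helper for illustration only.
infer_format() {
    case "$1" in
        *.fastq.gz) echo f ;;
        *.fastq)    echo F ;;
        *.bam)      echo b ;;
        *)          echo f ;;
    esac
}

infer_format reads.fastq
infer_format reads.fastq.gz
infer_format reads.bam
```

When the inferred format would be wrong, for example for data arriving on stdin, pass -f explicitly instead.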