🕷️ Crawler Inspector

URL Lookup

Direct Parameter Lookup

Raw Queries and Responses

1. Shard Calculation

Query:
Response:
Calculated Shard: 79 (from laksa044)

2. Crawled Status Check

Query:
Response:

3. Robots.txt Check

Query:
Response:

4. Spam/Ban Check

Query:
Response:

5. Seen Status Check

ℹ️ Skipped - page is already crawled

📄 INDEXABLE
✅ CRAWLED (5 days ago)
🤖 ROBOTS ALLOWED

Page Info Filters

Filter        Status  Condition                                         Details
HTTP status   PASS    download_http_code = 200                          HTTP 200
Age cutoff    PASS    download_stamp > now() - 6 MONTH                  0.2 months ago
History drop  PASS    isNull(history_drop_reason)                       No drop reason
Spam/ban      PASS    fh_dont_index != 1 AND ml_spam_score = 0          ml_spam_score=0
Canonical     PASS    meta_canonical IS NULL OR = '' OR = src_unparsed  Not set
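
The five filter conditions above can be sketched as a single function over one page record. This is a minimal illustration, not the inspector's actual implementation: the function name `page_info_filters`, the dictionary layout, the 183-day approximation of `6 MONTH`, the `fh_dont_index` value of 0, and the `now` timestamp are all assumptions. Only the column names and conditions come from the table.

```python
from datetime import datetime, timedelta

# Assumption: "6 MONTH" is approximated as 183 days; the real query
# engine may well use calendar months instead.
AGE_CUTOFF = timedelta(days=183)


def page_info_filters(row: dict, now: datetime) -> dict:
    """Return a PASS/FAIL verdict per filter for one page record.

    Note: None is treated like SQL NULL only where the table says so
    (history drop, canonical); other NULL semantics are not modelled.
    """
    canonical = row.get("meta_canonical")
    checks = {
        "HTTP status": row.get("download_http_code") == 200,
        "Age cutoff": row["download_stamp"] > now - AGE_CUTOFF,
        "History drop": row.get("history_drop_reason") is None,
        "Spam/ban": row.get("fh_dont_index") != 1
        and row.get("ml_spam_score") == 0,
        "Canonical": canonical is None
        or canonical == ""
        or canonical == row.get("src_unparsed"),
    }
    return {name: "PASS" if ok else "FAIL" for name, ok in checks.items()}


# Hypothetical record mirroring the values shown on this page.
row = {
    "download_http_code": 200,
    "download_stamp": datetime(2026, 4, 2, 0, 6, 59),
    "history_drop_reason": None,
    "fh_dont_index": 0,  # assumed; the page only shows ml_spam_score
    "ml_spam_score": 0,
    "meta_canonical": None,
    "src_unparsed": "uk,ac,babraham!bioinformatics,www,/projects/fastqc/ s443",
}
now = datetime(2026, 4, 7)  # assumed "5 days after the last crawl"

for name, verdict in page_info_filters(row, now).items():
    print(f"{name}: {verdict}")
```

Keeping each condition as a named boolean makes it easy to see which filter would drop a page, which is the same information the Status column conveys.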

Page Details

Property          Value
URL               https://www.bioinformatics.babraham.ac.uk/projects/fastqc/
Last Crawled      2026-04-02 00:06:59 (5 days ago)
First Indexed     2017-03-23 11:56:36 (9 years ago)
HTTP Status Code  200
Meta Title        Babraham Bioinformatics - FastQC A Quality Control tool for High Throughput Sequence Data
Meta Description  null
Meta Canonical    null
Boilerpipe Text
Function A quality control tool for high throughput sequence data. Language Java Requirements A suitable Java Runtime Environment The Picard BAM/SAM Libraries (included in download) Code Maturity Stable. Mature code, but feedback is appreciated. Code Released Yes, under GPL v3 or later . Initial Contact Simon Andrews Download Now View our tutorial video FastQC aims to provide a simple way to do some quality control checks on raw sequence data coming from high throughput sequencing pipelines. It provides a modular set of analyses which you can use to give a quick impression of whether your data has any problems of which you should be aware before doing any further analysis. The main functions of FastQC are Import of data from BAM, SAM or FastQ files (any variant) Providing a quick overview to tell you in which areas there may be problems Summary graphs and tables to quickly assess your data Export of results to an HTML based permanent report Offline operation to allow automated generation of reports without running the interactive application Documentation A copy of the FastQC documentation is available for you to try before you buy (well download..). 
Example Reports Good Illumina Data Bad Illumina Data Adapter dimer contaminated run Small RNA with read-through adapter Reduced Representation BS-Seq PacBio 454 Changelog 01-03-23: Version 0.12.0 released Fix a bug in file type detection on OSX 01-03-23: Version 0.12.0 released Add total base count to basic stats Add dup_length option to set the level of truncation for duplicate finding Make default truncation length always 50bp Removed the deduplicated duplication line from the duplicate plot Improve memory handling and add a --memory option to the command line Move BAM parsing to htsjdk Make colours colourblind friendly Generate SVG versions of graphs, and add a --svg option to use these in the report Add line numbers to parsing errors Change the default adapter sequences to search 08-01-19: Version 0.11.9 released Fixed a bug when analysing empty files Added support for multi-read fast5 files Fixed a corner case bug in adapter detection Bundled a JRE with the OSX build so you don't have to install it Fixed a hang if the program runs out of memory 04-10-18: Version 0.11.8 released Fixed a performance bug in highly duplicated sequences Changed the behaviour of the sequence length module when run with --nogroup Other minor bug fixes 10-01-18: Version 0.11.7 released Fixed a crash if the first sequence in a file was shorter than 12bp 21-12-17: Version 0.11.6 released Disabled the Kmer plot by default Fixed a bug when long custom adapters were being used Changed the tile number cutoff to accommodate the novaseq Fixed various format changes in nanopore data from ONT Added new Clontech sequences to the contaminant list Added a --min-length option to remove short sequences Added an option to specify the output name of data streamed into the program 08-03-16: Version 0.11.5 released Fixed the smallRNA adapter sequence so that abundance isn't under-represented in the adapter content plot Fixed a bug in the warn / error code for the per-base sequence content plot Fixed a 
typo in the documentation for the duplication plot 09-10-15: Version 0.11.4 released Changed the OSX launcher to not rely on the internal JVM framework, but use any command line java which is found Fixed a typo in one of the adapter sequences Fixed a bug which meant that some file extensions weren't removed from report names in non-interactive mode Made the per-tile module not collect any stats if it's disabled in limits.txt Fixed a bug in the calculation of duplication for highly duplicated, ordered files with very small numbers of sequences Fixed an incorrect error flag in the per-base quality module where there were less than 100 observations in a read group 25-3-15: Version 0.11.3 released Fixed a bug when disabling the per-tile plot from limits.txt Fixed a bug which caused the program to continue when processing of multiple files was actually complete Fixed a bug which meant format selection in the interactive application didn't work Added checks for mis-itentifying tile numbers in confusing sample ids Added the SOLID smallRNA adapter to the standard search set Fixed a bug when extracting casava names from uncompressed fastq files Added support for processing files of Oxford Nanopore reads 6-6-14: Version 0.11.2 released Fixed incorrect warn/fail defaults for per-seq quality plot Fixed memory leaks in Kmer and per-seq quality modules Added an option to use a custom limits file Fixed a bug in the naming of the folder inside the zip output file Fixed a bug in the --extract option 2-6-14: Version 0.11.1 released Added configurable warn/fail thresholds for all modules Allow modules to be selectively turned off Added a per-tile quality plot for Illumina libraries Added an adapter content plot Improved the duplication plot Improved the Kmer module Used embedded graphics in the HTML output so you can distribute a single file Added the ability to read data from stdin Changed how base grouping works to better accommodate long reads Dropped support for Solexa64 format 
(NB not Phred 64 which is still supported) 3-5-12: Version 0.10.1 released Added a workround to allow the analysis of concatenated gzipped files Fixed a bug when FastQC was installed in a path containing characters needing to be escaped in a URL Added an option to specify the location of the java interpreter on the command line 9-9-11: Version 0.10.0 released Added a Casava mode to sanely process the multiple fastq files produced by the latest illumina pipeline Fixed a bug in Kmer analysis which missed of the last possible Kmer in each sequence Fixed a classpath bug if using the wrapper script under windows 31-8-11: Version 0.9.6 released Fixed a crash in libraries where every sequence ended in poly-N Fixed the launch wrapper to set the classpath correctly on OSX 16-8-11: Version 0.9.5 released Fixed a bug in text output for the per-base sequence content module Made progress reporting absolute, and not approximate Added a print CSS style so reports are printable again 13-7-11: Version 0.9.4 released Improved the error reporting for failed files in the offline application 16-6-11: Version 0.9.3 released Added support for bzip2 compressed fastq files Added new CSS theme for HTML reports, contributed by Phil Ewels 16-5-11: Version 0.9.2 released Fixed a bug where grouped base numbers weren't reported in the per-base quality text report Fixed a crash in the Kmer analysis when analysing small files 30-3-11: Version 0.9.1 released Added --quiet and --nogroup options to command line Added encoding type to the basic stats Added detection of Illumina <1.3 1.3 1.5 and 1.9 encodings 10-2-11: Version 0.9.0 released Added support for very long reads (esp 454 and PacBio) Duplication detection now uses only the first 50bp of each read 21-1-11: Version 0.8.0 released Made all graphs easier to interpret Added an option to analyse only mapped sequences from a BAM/SAM file Added an option to analyse two or more files in parallel 24-11-10: Version 0.7.2 released Fixed bug when 
analysing libraries with no unique sequences Added an option to specify a custom contaminant list on the command line 24-11-10: Version 0.7.1 released Improved the command line interface with proper options and error handling Added an option to force the file format where guessing from the filename doesn't work 27-10-10: Version 0.7.0 released Added a Kmer enrichment analysis to find non-aligned enriched sequences Cleaned up axis labels on all graphs 27-10-10: Version 0.6.1 released Fixed a bug which caused some sequences and qualities from BAM/SAM files to be reversed 18-10-10: Version 0.6.0 released Sequences can now be read from SAM/BAM format files Added smoother lines to the graphs 29-09-10: Version 0.5.1 released Fixed a formatting bug in the text output Fixed the %GC plot to work well with reads over 100bp Improved the fitting of the modelled curve to the %GC plot Added more illumina oligos to the contaminants file 16-09-10: Version 0.5.0 released Improved the fitting of the normal distribution to %GC plot Calculated the total duplicated sequence % in the duplicate sequence module Added pass/fail/warn icons next to each section of the HTML report Put Icons and Images into subfolders in the HTML report 30-07-10: Version 0.4.3 released Fixed the reporting of sequence counts in the Basic Stats module Added a warning before overwriting reports in the interactive application 26-07-10: Version 0.4.2 released Fixed y-axis scale on per-base quality plot Added fail / warn checks to modules which lacked them and improved existing checks Added a modelled distribtion to the per-sequence GC plot Scale the width of report graphs for long sequence reads 24-06-10: Version 0.4.1 released Changed the duplicate module to reduce memory usage for long sequences Changed the way duplicate levels are counted to be more realistic 18-06-10: Version 0.4 released Added a sequence duplication level module Added a lauch wrapper for easier use from the commandline Added full machine 
parsable output for integration into pipelines 28-05-10: Version 0.3.1 released Fixed a bug where invalid template files caused a crash Non-interactive use now correctly reports progress for all files, not just the first one Added some missing documentation 13-05-10: Version 0.3 released Added support for gzip compressed fastq files Added identification of overrepresented sequences Improved colorspace support Added an option to save non-interactive reports to a specific directory 06-05-10: Version 0.2 released Added support for colorspace fastq files Added templating support to allow customisation of HTML reports Unzipped non-interactive reports by default, and added an option to turn this off Added easily computer readable summary file to reports 28-04-10: Version 0.1.1 released Fixed a bug which prevented non-interactive use on a headless system 26-04-10: Version 0.1 released Initial set of 9 modules Interactive and offline operation functional
Markdown
![Babraham Bioinformatics](https://www.bioinformatics.babraham.ac.uk/images/babraham_bioinformatics.gif)

[About](https://www.bioinformatics.babraham.ac.uk/index.html) \| [People](https://www.bioinformatics.babraham.ac.uk/people.html) \| [Services](https://www.bioinformatics.babraham.ac.uk/services.html) \| [Projects](https://www.bioinformatics.babraham.ac.uk/projects/index.html) \| [Training](https://www.bioinformatics.babraham.ac.uk/training.html) \| [Publications](https://www.bioinformatics.babraham.ac.uk/publications.html)

## FastQC

| | |
|---|---|
| Function | A quality control tool for high throughput sequence data. |
| Language | Java |
| Requirements | A [suitable Java Runtime Environment](https://adoptopenjdk.net/) The [Picard](http://picard.sourceforge.net/) BAM/SAM Libraries (included in download) |
| Code Maturity | Stable. Mature code, but feedback is appreciated. |
| Code Released | Yes, under [GPL v3 or later](http://www.gnu.org/copyleft/gpl.html). |
| Initial Contact | [Simon Andrews](https://www.bioinformatics.babraham.ac.uk/people.html#simon) |
| [Download Now](https://www.bioinformatics.babraham.ac.uk/projects/download.html#fastqc) | |

![](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/fastqc.png)

[View our tutorial video](http://www.youtube.com/watch?v=bz93ReOv87Y)

FastQC aims to provide a simple way to do some quality control checks on raw sequence data coming from high throughput sequencing pipelines. It provides a modular set of analyses which you can use to give a quick impression of whether your data has any problems of which you should be aware before doing any further analysis.

The main functions of FastQC are

- Import of data from BAM, SAM or FastQ files (any variant)
- Providing a quick overview to tell you in which areas there may be problems
- Summary graphs and tables to quickly assess your data
- Export of results to an HTML based permanent report
- Offline operation to allow automated generation of reports without running the interactive application

## Documentation

A [copy of the FastQC](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/Help/) documentation is available for you to try before you buy (well download..).

## Example Reports

- [Good Illumina Data](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/good_sequence_short_fastqc.html)
- [Bad Illumina Data](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/bad_sequence_fastqc.html)
- [Adapter dimer contaminated run](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/RNA-Seq_fastqc.html)
- [Small RNA with read-through adapter](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/small_rna_fastqc.html)
- [Reduced Representation BS-Seq](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/RRBS_fastqc.html)
- [PacBio](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/pacbio_srr075104_fastqc.html)
- [454](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/454_SRR073599_fastqc.html)

## Changelog

- 01-03-23: Version 0.12.0 released
  - Fix a bug in file type detection on OSX
- 01-03-23: Version 0.12.0 released
  - Add total base count to basic stats
  - Add dup\_length option to set the level of truncation for duplicate finding
  - Make default truncation length always 50bp
  - Removed the deduplicated duplication line from the duplicate plot
  - Improve memory handling and add a --memory option to the command line
  - Move BAM parsing to htsjdk
  - Make colours colourblind friendly
  - Generate SVG versions of graphs, and add a --svg option to use these in the report
  - Add line numbers to parsing errors
  - Change the default adapter sequences to search
- 08-01-19: Version 0.11.9 released
  - Fixed a bug when analysing empty files
  - Added support for multi-read fast5 files
  - Fixed a corner case bug in adapter detection
  - Bundled a JRE with the OSX build so you don't have to install it
  - Fixed a hang if the program runs out of memory
- 04-10-18: Version 0.11.8 released
  - Fixed a performance bug in highly duplicated sequences
  - Changed the behaviour of the sequence length module when run with --nogroup
  - Other minor bug fixes
- 10-01-18: Version 0.11.7 released
  - Fixed a crash if the first sequence in a file was shorter than 12bp
- 21-12-17: Version 0.11.6 released
  - Disabled the Kmer plot by default
  - Fixed a bug when long custom adapters were being used
  - Changed the tile number cutoff to accommodate the novaseq
  - Fixed various format changes in nanopore data from ONT
  - Added new Clontech sequences to the contaminant list
  - Added a --min-length option to remove short sequences
  - Added an option to specify the output name of data streamed into the program
- 08-03-16: Version 0.11.5 released
  - Fixed the smallRNA adapter sequence so that abundance isn't under-represented in the adapter content plot
  - Fixed a bug in the warn / error code for the per-base sequence content plot
  - Fixed a typo in the documentation for the duplication plot
- 09-10-15: Version 0.11.4 released
  - Changed the OSX launcher to not rely on the internal JVM framework, but use any command line java which is found
  - Fixed a typo in one of the adapter sequences
  - Fixed a bug which meant that some file extensions weren't removed from report names in non-interactive mode
  - Made the per-tile module not collect any stats if it's disabled in limits.txt
  - Fixed a bug in the calculation of duplication for highly duplicated, ordered files with very small numbers of sequences
  - Fixed an incorrect error flag in the per-base quality module where there were less than 100 observations in a read group
- 25-3-15: Version 0.11.3 released
  - Fixed a bug when disabling the per-tile plot from limits.txt
  - Fixed a bug which caused the program to continue when processing of multiple files was actually complete
  - Fixed a bug which meant format selection in the interactive application didn't work
  - Added checks for mis-itentifying tile numbers in confusing sample ids
  - Added the SOLID smallRNA adapter to the standard search set
  - Fixed a bug when extracting casava names from uncompressed fastq files
  - Added support for processing files of Oxford Nanopore reads
- 6-6-14: Version 0.11.2 released
  - Fixed incorrect warn/fail defaults for per-seq quality plot
  - Fixed memory leaks in Kmer and per-seq quality modules
  - Added an option to use a custom limits file
  - Fixed a bug in the naming of the folder inside the zip output file
  - Fixed a bug in the --extract option
- 2-6-14: Version 0.11.1 released
  - Added configurable warn/fail thresholds for all modules
  - Allow modules to be selectively turned off
  - Added a per-tile quality plot for Illumina libraries
  - Added an adapter content plot
  - Improved the duplication plot
  - Improved the Kmer module
  - Used embedded graphics in the HTML output so you can distribute a single file
  - Added the ability to read data from stdin
  - Changed how base grouping works to better accommodate long reads
  - Dropped support for Solexa64 format (NB **not** Phred 64 which is still supported)
- 3-5-12: Version 0.10.1 released
  - Added a workround to allow the analysis of concatenated gzipped files
  - Fixed a bug when FastQC was installed in a path containing characters needing to be escaped in a URL
  - Added an option to specify the location of the java interpreter on the command line
- 9-9-11: Version 0.10.0 released
  - Added a Casava mode to sanely process the multiple fastq files produced by the latest illumina pipeline
  - Fixed a bug in Kmer analysis which missed of the last possible Kmer in each sequence
  - Fixed a classpath bug if using the wrapper script under windows
- 31-8-11: Version 0.9.6 released
  - Fixed a crash in libraries where every sequence ended in poly-N
  - Fixed the launch wrapper to set the classpath correctly on OSX
- 16-8-11: Version 0.9.5 released
  - Fixed a bug in text output for the per-base sequence content module
  - Made progress reporting absolute, and not approximate
  - Added a print CSS style so reports are printable again
- 13-7-11: Version 0.9.4 released
  - Improved the error reporting for failed files in the offline application
- 16-6-11: Version 0.9.3 released
  - Added support for bzip2 compressed fastq files
  - Added new CSS theme for HTML reports, contributed by Phil Ewels
- 16-5-11: Version 0.9.2 released
  - Fixed a bug where grouped base numbers weren't reported in the per-base quality text report
  - Fixed a crash in the Kmer analysis when analysing small files
- 30-3-11: Version 0.9.1 released
  - Added --quiet and --nogroup options to command line
  - Added encoding type to the basic stats
  - Added detection of Illumina \<1.3 1.3 1.5 and 1.9 encodings
- 10-2-11: Version 0.9.0 released
  - Added support for very long reads (esp 454 and PacBio)
  - Duplication detection now uses only the first 50bp of each read
- 21-1-11: Version 0.8.0 released
  - Made all graphs easier to interpret
  - Added an option to analyse only mapped sequences from a BAM/SAM file
  - Added an option to analyse two or more files in parallel
- 24-11-10: Version 0.7.2 released
  - Fixed bug when analysing libraries with no unique sequences
  - Added an option to specify a custom contaminant list on the command line
- 24-11-10: Version 0.7.1 released
  - Improved the command line interface with proper options and error handling
  - Added an option to force the file format where guessing from the filename doesn't work
- 27-10-10: Version 0.7.0 released
  - Added a Kmer enrichment analysis to find non-aligned enriched sequences
  - Cleaned up axis labels on all graphs
- 27-10-10: Version 0.6.1 released
  - Fixed a bug which caused some sequences and qualities from BAM/SAM files to be reversed
- 18-10-10: Version 0.6.0 released
  - Sequences can now be read from SAM/BAM format files
  - Added smoother lines to the graphs
- 29-09-10: Version 0.5.1 released
  - Fixed a formatting bug in the text output
  - Fixed the %GC plot to work well with reads over 100bp
  - Improved the fitting of the modelled curve to the %GC plot
  - Added more illumina oligos to the contaminants file
- 16-09-10: Version 0.5.0 released
  - Improved the fitting of the normal distribution to %GC plot
  - Calculated the total duplicated sequence % in the duplicate sequence module
  - Added pass/fail/warn icons next to each section of the HTML report
  - Put Icons and Images into subfolders in the HTML report
- 30-07-10: Version 0.4.3 released
  - Fixed the reporting of sequence counts in the Basic Stats module
  - Added a warning before overwriting reports in the interactive application
- 26-07-10: Version 0.4.2 released
  - Fixed y-axis scale on per-base quality plot
  - Added fail / warn checks to modules which lacked them and improved existing checks
  - Added a modelled distribtion to the per-sequence GC plot
  - Scale the width of report graphs for long sequence reads
- 24-06-10: Version 0.4.1 released
  - Changed the duplicate module to reduce memory usage for long sequences
  - Changed the way duplicate levels are counted to be more realistic
- 18-06-10: Version 0.4 released
  - Added a sequence duplication level module
  - Added a lauch wrapper for easier use from the commandline
  - Added full machine parsable output for integration into pipelines
- 28-05-10: Version 0.3.1 released
  - Fixed a bug where invalid template files caused a crash
  - Non-interactive use now correctly reports progress for all files, not just the first one
  - Added some missing documentation
- 13-05-10: Version 0.3 released
  - Added support for gzip compressed fastq files
  - Added identification of overrepresented sequences
  - Improved colorspace support
  - Added an option to save non-interactive reports to a specific directory
- 06-05-10: Version 0.2 released
  - Added support for colorspace fastq files
  - Added templating support to allow customisation of HTML reports
  - Unzipped non-interactive reports by default, and added an option to turn this off
  - Added easily computer readable summary file to reports
- 28-04-10: Version 0.1.1 released
  - Fixed a bug which prevented non-interactive use on a headless system
- 26-04-10: Version 0.1 released
  - Initial set of 9 modules
  - Interactive and offline operation functional

Having problems with the site? Please [let us know](mailto:simon.andrews@babraham.ac.uk)
Readable Markdown  null
Shard              79 (laksa)
Root Hash          9785297403088939879
Unparsed URL       uk,ac,babraham!bioinformatics,www,/projects/fastqc/ s443
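
The Unparsed URL value appears to be a reversed-host key for the page URL. The sketch below turns such a key back into a conventional URL. The format rules are inferred from this single example and are assumptions, not a documented specification: host labels are taken to be comma-separated in reverse order, `!` is treated as just another label separator, and the trailing `s443` is read as "https on port 443". The function name `parse_unparsed_url` is hypothetical.

```python
def parse_unparsed_url(key: str) -> str:
    """Rebuild a URL from a reversed-host key (format inferred, not confirmed)."""
    # The last space-separated token is assumed to carry scheme and port.
    location, _, suffix = key.rpartition(" ")
    if not suffix.startswith("s"):
        # Only the "s<port>" form seen in the example is handled here.
        raise ValueError(f"unrecognised scheme/port suffix: {suffix!r}")
    scheme = "https"
    port = int(suffix[1:])

    # Everything before the first "/" is the reversed host.
    host_part, slash, path = location.partition("/")
    labels = [p for p in host_part.replace("!", ",").split(",") if p]
    host = ".".join(reversed(labels))

    # Omit the port when it is the default for https.
    netloc = host if port == 443 else f"{host}:{port}"
    return f"{scheme}://{netloc}{slash}{path}"


key = "uk,ac,babraham!bioinformatics,www,/projects/fastqc/ s443"
print(parse_unparsed_url(key))
```

For the key shown on this page, the result matches the URL property listed under Page Details.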