HiChIP Data Sets

To download one of the data sets, simply use the wget command:

wget https://s3.amazonaws.com/dovetail.pub/HiChIP/fastqs/HiChiP_CTCF_2M_R1.fastq.gz
wget https://s3.amazonaws.com/dovetail.pub/HiChIP/fastqs/HiChiP_CTCF_2M_R2.fastq.gz

For testing purposes, we recommend using the 2M reads data sets; for any other purpose, we recommend using the 800M reads data set.
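Both mates of a paired-end library are needed, so the two wget calls above can be wrapped in a small loop. The following is a minimal sketch, not an official helper: the function names fastq_url and download_pair are hypothetical, and only the HiChiP_CTCF_2M prefix is confirmed by the commands above. Other libraries may use a different file naming scheme.

```shell
# Base location of the example FASTQ files, taken from the wget commands above.
BASE_URL="https://s3.amazonaws.com/dovetail.pub/HiChIP/fastqs"

# Print the URL of one read file: $1 = file prefix, $2 = mate (R1 or R2).
# (Hypothetical helper; the naming pattern is assumed from the 2M example.)
fastq_url() {
    printf '%s/%s_%s.fastq.gz\n' "$BASE_URL" "$1" "$2"
}

# Download both mates of a library; -c resumes an interrupted download.
download_pair() {
    for mate in R1 R2; do
        wget -c "$(fastq_url "$1" "$mate")" || return 1
    done
}
```

Calling download_pair HiChiP_CTCF_2M fetches the same two files as the commands above, and can be rerun safely if the transfer is interrupted.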

Sequenced (human) libraries:

Library                             Link
GM12878 CTCF 2M
GM12878 CTCF (deep sequencing)
GM12878 H3K27Ac (deep sequencing)
GM12878 H3K4me3 (deep sequencing)

Human, hg38, Peak files from ENCODE project

Sample    Target     Accession     URL                                                                             Output type
GM12878   CTCF       ENCFF017XLW   https://www.encodeproject.org/files/ENCFF017XLW/@@download/ENCFF017XLW.bed.gz   conservative IDR thresholded peaks
IMR-90    H3K4ac     ENCFF823NUO   https://www.encodeproject.org/files/ENCFF823NUO/@@download/ENCFF823NUO.bed.gz   replicated peaks
GM12878   H3K4me3    ENCFF188SZS   https://www.encodeproject.org/files/ENCFF188SZS/@@download/ENCFF188SZS.bed.gz   replicated peaks
IMR-90    H3K14ac    ENCFF106EAN   https://www.encodeproject.org/files/ENCFF106EAN/@@download/ENCFF106EAN.bed.gz   replicated peaks
GM12878   H3K27ac    ENCFF367KIF   https://www.encodeproject.org/files/ENCFF367KIF/@@download/ENCFF367KIF.bed.gz   replicated peaks
GM12878   H3K27me3   ENCFF153VOQ   https://www.encodeproject.org/files/ENCFF153VOQ/@@download/ENCFF153VOQ.bed.gz   replicated peaks
GM12878   H3K36me3   ENCFF268HMO   https://www.encodeproject.org/files/ENCFF268HMO/@@download/ENCFF268HMO.bed.gz   replicated peaks
GM12878   SMC3       ENCFF534PUK   https://www.encodeproject.org/files/ENCFF534PUK/@@download/ENCFF534PUK.bed.gz   bed
MCF-7     Klf4       ENCFF287QDZ   https://www.encodeproject.org/files/ENCFF287QDZ/@@download/ENCFF287QDZ.bed.gz   conservative IDR thresholded peaks
GM23338   Nanog      ENCFF897LBK   https://www.encodeproject.org/files/ENCFF897LBK/@@download/ENCFF897LBK.bed.gz   conservative IDR thresholded peaks
GM12878   POLR2A     ENCFF794VYB   https://www.encodeproject.org/files/ENCFF794VYB/@@download/ENCFF794VYB.bed.gz   conservative IDR thresholded peaks
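Every peak file URL listed above follows the same pattern, with the accession appearing twice. As a rough sketch, the URL can therefore be built from the accession alone. The function names encode_bed_url and download_peaks are hypothetical, and the pattern is inferred only from the entries in this listing; it is not guaranteed to hold for other ENCODE file types.

```shell
# Build the download URL for an ENCODE bed.gz file from its accession ($1).
# (Pattern inferred from the URLs listed above; an assumption for other files.)
encode_bed_url() {
    printf 'https://www.encodeproject.org/files/%s/@@download/%s.bed.gz\n' "$1" "$1"
}

# Download one or more peak files by accession; -c resumes partial downloads.
download_peaks() {
    for acc in "$@"; do
        wget -c "$(encode_bed_url "$acc")" || return 1
    done
}
```

For example, download_peaks ENCFF017XLW ENCFF188SZS would fetch the GM12878 CTCF and H3K4me3 peak files.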

Data used for HiChIP Comparative Analysis (Mouse, mm10)

To get a list of all the files generated from the HiChIP Comparative Analysis tutorial, including the required reference genomes, you can use the command:

aws s3 ls s3://dovetail.pub/HiChIP/compare_samples/

Use wget to download any given file, replacing “s3://” with “https://s3.amazonaws.com/”, followed by the remaining path to the file. For example:

wget https://s3.amazonaws.com/dovetail.pub/HiChIP/compare_samples/Reference_Genome/mm10.fa
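The prefix substitution described above can be done with shell parameter expansion. This is a minimal sketch; the function name s3_to_https is hypothetical and not part of any Dovetail or AWS tooling.

```shell
# Convert an s3:// URI ($1) into the equivalent https://s3.amazonaws.com/ URL.
s3_to_https() {
    case "$1" in
        # ${1#s3://} strips the leading "s3://" and keeps bucket and path.
        s3://*) printf 'https://s3.amazonaws.com/%s\n' "${1#s3://}" ;;
        *) echo "not an s3:// URI: $1" >&2; return 1 ;;
    esac
}
```

A file could then be fetched with wget "$(s3_to_https s3://dovetail.pub/HiChIP/compare_samples/Reference_Genome/mm10.fa)". If the AWS CLI is not configured with credentials, adding the --no-sign-request option to the aws s3 ls command may be needed to list the bucket anonymously.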

Data Set            Link
Fastqs (Sample A)
Fastqs (Sample B)

Note: The full data set, including input files and generated output, is ~183 GB (roughly 5 h at a download rate of 10 MB/s).
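The time estimate in the note can be reproduced with a short calculation. The sketch below assumes decimal prefixes (1 giga = 1000 mega) and that the size and the rate are expressed in the same base unit, both bytes or both bits; the function name download_hours is hypothetical.

```shell
# Estimate transfer time in hours: $1 = size in giga-units, $2 = rate in mega-units per second.
download_hours() {
    awk -v size="$1" -v rate="$2" 'BEGIN { printf "%.1f\n", size * 1000 / rate / 3600 }'
}

download_hours 183 10   # prints 5.1
```

Substituting a measured connection speed for the second argument gives a more realistic figure before starting the full download.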