• Welcome
  • Research
  • CV
  • Publications
  • Software
  • Blogs
  • Travel
  • Contact
  • Links

iAmphioxus

~ It's a long way from Amphioxus

iAmphioxus

Tag Archives: Genome assembly

Using LiftOver to convert genome assembly coordinates

25 Tuesday Jun 2013

Posted by iAmphioxus in Bioinformatics, Blogs

≈ 4 Comments

Tags

Genome assembly, Kent Utilities, LiftOver, UCSC Genome Browser

Suppose we have two different assemblies of the same organism, G1.fasta and G2.fasta, and we want to know the which regions in G2 correspond to our interested regions in G1, what should we do? UCSC LiftOver is a great tool to solve this kind of problems and here is how>

#!/usr/bin/sh

# split G2 into individual chromosome/scaffold (G2.$i.fa) and save them into ./G2_chr direcotry
# I did this step by using a perl script which I wrote, but you can also use UCSC faSplit tool I think.
# Say there are n chromosomes/scaffolds in the G2 assembly, so we will have G2.1.fa, G2.2.fa, ..., G2.n.fa.

# Split the G2 chromosomes/scaffolds into 3K chunks and make lift files
mkdir lift
mkdir split
for i in {1..n}
do
  faSplit size ./G2_chr/G2.$i.fa  3000 ./split/G2.$i.split  -lift=./lift/G2.$i.lft -oneFile
done

# run blat
mkdir psl
for i in {1..n}
do
  blat G1.fasta ./split/G2.$i.split.fa -t=dna -q=dna -tileSize=12 -fastMap -minIdentity=95 -noHead -minScore=100  ./psl/G2.$i.psl
done

# Change coordinates of .psl files to parent coordinate system
mkdir liftup
for i in {1..n}
do
	liftUp -pslQ ./liftup/G2.$i.liftup.psl ./lift/G2.$i.lft warn ./psl/G2.$i.psl
done

# Make chain files
mkdir chain_raw
for i in {1..n}
do
	axtChain -linearGap=medium  -faQ -faT -psl ./liftup/G2.$i.liftup.psl ./G1.fasta ./G2.fasta ./chain_raw/$i.chain
done

# Merge and sort chain files
chainMergeSort ./chain_raw/*.chain | chainSplit chain_split stdin

faSize G1.fasta  -detailed >G1.chr_length.txt
faSize G2.fasta  -detailed >G2.chr_length.txt

# Make alignment nets from chain files
mkdir net
for i in  ./chain_split/*.chain
do
 tag=${i/\.\/chain_split\//}
 chainNet $i ./G1.chr_length.txt ./G2.chr_length.txt ./net/$tag.net /dev/null
done

# Create liftOver chain file
mkdir over
for i in ./chain_split/*.chain
do
  tag=${i/\.\/chain_split\//}
  netChainSubset  ./net/$tag.net $i ./over/$tag.chain
done

cat ./over/*.chain >G1_G2.over.chain

# Make bed file to report converted coordinates. We can give the coordinates of our query regions (based on G1 assembly) in the input.bed file and liftOver will report the converted coordinates in conversion.bed file.
liftOver input.bed  ./G1_G2.over.chain conversion.bed unMapped

References
http://hgwdev.cse.ucsc.edu/~kent/src/unzipped/hg/doc/liftOver.txt
http://genomewiki.ucsc.edu/index.php/Minimal_Steps_For_LiftOver
http://genomewiki.ucsc.edu/index.php/LiftOver_Howto

29.718460 -95.402683

Share this:

  • Email
  • Twitter
  • Facebook

Like this:

Like Loading...

Recent Posts

  • Using anaconda to manage local python environment
  • Local installation of ncbi-blast+ together with the nr and taxonomy database
  • Installing Augustus with manual bamtools installation
  • Custom installation of PacBio’s GenomicConsensus (Quiver)
  • Compiling 64-bit MUMmer

Archives

  • January 2018 (2)
  • May 2017 (2)
  • February 2016 (1)
  • January 2016 (1)
  • August 2015 (1)
  • July 2015 (1)
  • April 2015 (1)
  • June 2013 (1)
  • April 2013 (1)
  • October 2012 (1)
  • May 2012 (1)
  • April 2012 (1)
  • March 2012 (1)
  • February 2012 (2)
  • January 2012 (2)

Categories

  • Blogs (19)
    • Bioinformatics (14)
    • Journal club (3)

2012 annotation arabidopsis Augustus bamtools BioPerl blasr blast C. elegans Centennial CMake conda CPAN developmental biology embryonic genesis Ensembl evolution genewise Genome assembly GenomicConsensus Git GitHub Kent Utilities LiftOver Macports maker maternal effect multicellularity mummer MySQL ncbi new year nr PacBio parallel evolution Perl plant biology Python Quiver resolution Rice rmath SMRTanalysis snoscan Stampy synteny taxonomy tRNAscan-SE UCSC Genome Browser wise2 yeast

Meta

  • Register
  • Log in
  • Entries feed
  • Comments feed
  • WordPress.com

Enter your email address to follow this blog and receive notifications of new posts by email.

Join 320 other followers

Blog Stats

  • 34,727 hits
Locations of visitors to this page
Map

Blog at WordPress.com.

loading Cancel
Post was not sent - check your email addresses!
Email check failed, please try again
Sorry, your blog cannot share posts by email.
%d bloggers like this: