Iowa State University now has a PacBio machine and there are going to be lots of questions about how to analyze the data. So I thought this would be a good first post for the blog.
Everything I document here comes from the stsPlots GitHub repository and from the PowerPoint documentation provided with it.
I am loading an older version of R, since many of the required packages are not yet available through install.packages.
These libraries are prerequisites for running stsPlots.R for QC of PacBio runs.
- module load 3.1.3
Change to the directory where you plan to do the QC.
Now that we have it installed, let's grab the stsPlots.R functions from GitHub.
- wget https://github.com/PacificBiosciences/stsPlots/raw/master/stsPlots.R
- # Also create soft links to all sts files. I take advantage of the xargs command to make this really easy.
- find ../../protein/ -name "*sts.csv" | xargs -I xx ln -s xx
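To see what that find/xargs line does before pointing it at real data, here is a minimal sketch that runs the same pattern against a throwaway directory tree. The directory and file names (runs, cell 1, m1.sts.csv and so on) are made up for illustration and only stand in for ../../protein/. The -print0 and -0 options and the explicit "." destination are my additions, not part of the original command; they just keep paths with spaces intact and make the link destination visible.

```shell
set -eu

# Hypothetical scratch layout standing in for ../../protein/ (names are made up).
demo=$(mktemp -d)
mkdir -p "$demo/runs/cell 1" "$demo/runs/cell2" "$demo/qc"
touch "$demo/runs/cell 1/m1.sts.csv" "$demo/runs/cell2/m2.sts.csv" "$demo/runs/cell2/notes.txt"

# Work from the QC directory, as in the post.
cd "$demo/qc"

# find lists every file whose name ends in sts.csv; xargs runs one "ln -s" per file,
# substituting the path for xx. -print0/-0 pass NUL-separated paths so the space in
# "cell 1" survives. The trailing "." is the current directory, where the links land.
find ../runs -name "*sts.csv" -print0 | xargs -0 -I xx ln -s xx .

# Show the resulting soft links and where they point.
ls -l
```

Because find is given a relative path, the links are relative too, so they keep working only as long as the QC directory and the data directory stay in the same positions relative to each other.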
Now that we have all the files we need, let's look at the QC.
Note that folder names require a double slash for every slash. If you are already in the folder when you start R from the command line, you can execute it as follows.
All plots can now be found in this folder. For interpretation, I highly recommend exploring the PowerPoint.
- https://github.com/PacificBiosciences/stsPlots/blob/master/stsPlots_Usage.pptx