Finding the predominant isoform
For the ceabigr data, let's identify which isoform is predominant for each gene, so that we can find out how treatment and/or methylation might influence this.
We start with the transcript expression table.
- https://github.com/epigeneticstoocean/2018_L18-adult-methylation/blob/main/data/whole_tx_table.csv
t_id | chr | strand | start | end | t_name | num_exons | length | gene_id | gene_name | cov.S12M | FPKM.S12M | cov.S13M | FPKM.S13M | cov.S16F | FPKM.S16F | cov.S19F | FPKM.S19F | cov.S22F | FPKM.S22F | cov.S23M | FPKM.S23M | cov.S29F | FPKM.S29F | cov.S31M | FPKM.S31M | cov.S35F | FPKM.S35F | cov.S36F | FPKM.S36F | cov.S39F | FPKM.S39F | cov.S3F | FPKM.S3F | cov.S41F | FPKM.S41F | cov.S44F | FPKM.S44F | cov.S48M | FPKM.S48M | cov.S50F | FPKM.S50F | cov.S52F | FPKM.S52F | cov.S53F | FPKM.S53F | cov.S54F | FPKM.S54F | cov.S59M | FPKM.S59M | cov.S64M | FPKM.S64M | cov.S6M | FPKM.S6M | cov.S76F | FPKM.S76F | cov.S77F | FPKM.S77F | cov.S7M | FPKM.S7M | cov.S9M | FPKM.S9M |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | NC_007175.2 | + | 1 | 1623 | gene-COX1 | 1 | 1623 | gene-COX1 | COX1 | 197.261856 | 230.708456 | 38.658657 | 63.109 | 5144.539062 | 1882.768525 | 8613.396484 | 3234.683313 | 8851.949219 | 3011.082644 | 99.887238 | 157.223424 | 3652.243896 | 2097.983334 | 50.569931 | 85.725339 | 4143.192383 | 2031.465213 | 5815.194336 | 2720.424882 | 6334.129883 | 2424.161901 | 4264.319336 | 1729.175357 | 5067.318359 | 2416.769825 | 9989.254883 | 3956.071235 | 101.027107 | 75.18694 | 5948.057129 | 2672.94828 | 6176.420898 | 3143.717701 | 5888.288574 | 2700.855757 | 3855.770508 | 2012.004551 | 3341.584717 | 2103.869062 | 1460.471313 | 1161.543557 | 107.976585 | 110.668782 | 5490.592285 | 1914.960388 | 8571.107422 | 2570.933145 | 189.353043 | 239.215973 | 98.109673 | 110.63652 |
2 | NC_007175.2 | + | 1710 | 8997 | rna-NC_007175.2:1710..8997 | 2 | 1469 | . | . | 2242.554199 | 2622.788958 | 96.919327 | 158.217649 | 109415.523438 | 40043.257756 | 234398.78125 | 88026.346834 | 170861.4375 | 58120.296015 | 882.683533 | 1389.351931 | 84232.40625 | 48386.194771 | 137.82959 | 233.646518 | 142036.0625 | 69642.269357 | 159205.9375 | 74478.644866 | 166901.03125 | 63875.406514 | 228253.34375 | 92556.402562 | 120283.171875 | 57366.977889 | 129583.375 | 51319.249367 | 391.246521 | 291.175602 | 145652.796875 | 65453.707719 | 64273.429688 | 32714.337634 | 125864.453125 | 57731.839833 | 105498.757812 | 55050.989262 | 78691.234375 | 49544.173635 | 104203.140625 | 82874.949689 | 475.213135 | 487.061698 | 237007.40625 | 82661.354364 | 238397.8125 | 71508.24364 | 2060.264404 | 2602.800279 | 628.857788 | 709.151659 |
3 | NC_007175.2 | + | 2645 | 3429 | gene-COX3 | 1 | 785 | gene-COX3 | COX3 | 145.308258 | 169.945901 | 44.127384 | 72.036519 | 3372.623047 | 1234.292994 | 2897.380859 | 1088.085233 | 5521.629883 | 1878.239865 | 117.998726 | 185.731071 | 1978.724854 | 1136.652394 | 48.717201 | 82.58462 | 2143.769531 | 1051.120205 | 3527.251953 | 1650.095152 | 2362.450928 | 904.143685 | 2339.378418 | 948.614583 | 2176.965332 | 1038.265953 | 6731.983887 | 2666.085521 | 104.224213 | 77.566307 | 3073.633057 | 1381.23458 | 3290.355225 | 1674.747906 | 2573.182129 | 1180.273976 | 1955.515747 | 1020.420322 | 2407.568115 | 1515.810162 | 1211.793579 | 963.764924 | 90.784721 | 93.048271 | 2992.1604 | 1043.579334 | 5468.978027 | 1640.438765 | 191.13121 | 241.46239 | 91.805099 | 103.526965 |
4 | NC_007175.2 | + | 3430 | 3495 | rna-NC_007175.2:3430..3495 | 1 | 66 | . | . | 64.439392 | 75.365369 | 33.803032 | 55.18235 | 220.757568 | 80.791573 | 191.015152 | 71.734016 | 481.545441 | 163.802693 | 52.378788 | 82.44469 | 300.863647 | 172.827154 | 30.969696 | 52.499334 | 344.636353 | 168.980027 | 304.424255 | 142.413696 | 302.984863 | 115.956631 | 242.969696 | 98.523862 | 249.303024 | 118.900764 | 339.5 | 134.45309 | 46.045456 | 34.268198 | 280.787872 | 126.180943 | 217.060608 | 110.481019 | 305.757568 | 140.245689 | 317.454559 | 165.653017 | 227.454544 | 143.205879 | 135.15152 | 107.488847 | 51.439392 | 52.721938 | 533.909119 | 186.212117 | 780.348511 | 234.068219 | 100.803032 | 127.34781 | 40.348484 | 45.500262 |
5 | NC_007175.2 | + | 3499 | 3567 | rna-NC_007175.2:3499..3567 | 1 | 69 | . | . | 72.623192 | 84.936768 | 59.855072 | 97.711458 | 249.289856 | 91.233654 | 177.043472 | 66.487078 | 486.405823 | 165.456002 | 53.347828 | 83.969968 | 374.376801 | 215.055815 | 27.52174 | 46.654414 | 410.811584 | 201.426669 | 332.144928 | 155.381794 | 353.159424 | 135.159152 | 268.478271 | 108.867553 | 306.753632 | 146.300838 | 273.289856 | 108.23171 | 63.13044 | 46.983277 | 242.08696 | 108.78946 | 203.492752 | 103.575157 | 315.144928 | 144.551508 | 378.652161 | 197.586934 | 198.08696 | 124.715984 | 194.594208 | 154.764867 | 82.333336 | 84.386166 | 691.956543 | 241.334505 | 921.637695 | 276.448396 | 115.101448 | 145.411473 | 55.260868 | 62.316691 |
Reading it in:
tx_exp <- read.csv("https://raw.githubusercontent.com/epigeneticstoocean/2018_L18-adult-methylation/main/data/whole_tx_table.csv")
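Taking the entire data set (all samples together), average FPKM per transcript and keep the transcript with the highest mean for each gene:
library(dplyr)
library(tidyr)

tx_exp %>%
  # keep gene and transcript names plus the per-sample FPKM columns
  select(starts_with(c("gene_name", "t_name", "FPKM"))) %>%
  # long format: one row per transcript per sample (the 26 FPKM columns)
  pivot_longer(cols = starts_with("FPKM")) %>%
  group_by(gene_name, t_name) %>%
  # mean FPKM across all samples for each transcript
  summarise(Predom_exp = mean(value)) %>%
  group_by(gene_name) %>%
  # keep the transcript with the highest mean FPKM for each gene
  slice(which.max(Predom_exp))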
This prints the following message from summarise(), which is informational rather than an error:
`summarise()` has grouped output by 'gene_name'. You can override using the `.groups` argument.
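The message above can be silenced by setting `.groups` explicitly. Here is a minimal sketch of the same pipeline on a tiny made-up table, so the result is easy to check by eye. The data frame `toy` and its values are hypothetical stand-ins for `tx_exp`, not real data.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for tx_exp (values are made up for illustration)
toy <- data.frame(
  gene_name = c("geneA", "geneA", "geneB"),
  t_name    = c("tx1", "tx2", "tx3"),
  FPKM.S1   = c(10, 50, 5),
  FPKM.S2   = c(20, 70, 7)
)

predom <- toy %>%
  pivot_longer(cols = starts_with("FPKM")) %>%
  group_by(gene_name, t_name) %>%
  # .groups = "drop" returns an ungrouped table, so no grouping message is printed
  summarise(Predom_exp = mean(value), .groups = "drop") %>%
  group_by(gene_name) %>%
  # one row per gene: the transcript with the highest mean FPKM
  slice(which.max(Predom_exp)) %>%
  ungroup()

print(predom)
```

For geneA, tx2 has the higher mean (60 vs 15), so it is the row that is kept.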
Output: https://raw.githubusercontent.com/sr320/ceabigr/main/output/42-predominant-isoform/predom_iso-all.txt
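One thing worth checking: in the table excerpt above, unannotated transcripts have "." for both gene_id and gene_name, so group_by(gene_name) treats all of them as a single gene and keeps only one transcript for the whole set. Below is a rough sketch of one way to deal with that, assuming those rows should simply be set aside. That is a judgment call, not something this analysis has settled on. The `tx_means` table and its values are hypothetical.

```r
library(dplyr)

# Hypothetical per-transcript means (made-up values), shaped like the
# output of the summarise() step
tx_means <- data.frame(
  gene_name  = c("COX1", ".", "COX3", "."),
  t_name     = c("gene-COX1", "rna-a", "gene-COX3", "rna-b"),
  Predom_exp = c(2000, 50000, 1200, 100)
)

predom_annotated <- tx_means %>%
  # assumption: drop transcripts with no gene name before picking a winner
  filter(gene_name != ".") %>%
  group_by(gene_name) %>%
  slice(which.max(Predom_exp)) %>%
  ungroup()

print(predom_annotated)
```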
Next up: do this for each comparison, join the results, and see whether the predominant isoform changes.
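A possible shape for that next step, sketched on made-up data: find the predominant isoform within each group, join the groups on gene_name, and flag genes where the answer differs. The `group` column and its labels ("control", "exposed") are hypothetical placeholders for whatever comparison is used, and the values are invented.

```r
library(dplyr)

# Toy long-format expression table (made-up values)
toy_long <- data.frame(
  gene_name = rep(c("geneA", "geneA", "geneB", "geneB"), times = 2),
  t_name    = rep(c("tx1", "tx2", "tx3", "tx4"), times = 2),
  group     = rep(c("control", "exposed"), each = 4),
  value     = c(10, 50, 8, 2,   # control
                60, 20, 9, 1)   # exposed
)

# Predominant isoform per gene within each group
predom_by_group <- toy_long %>%
  group_by(group, gene_name, t_name) %>%
  summarise(Predom_exp = mean(value), .groups = "drop") %>%
  group_by(group, gene_name) %>%
  slice(which.max(Predom_exp)) %>%
  ungroup()

ctrl <- predom_by_group %>%
  filter(group == "control") %>%
  select(gene_name, control = t_name)

expo <- predom_by_group %>%
  filter(group == "exposed") %>%
  select(gene_name, exposed = t_name)

# Join the two groups and flag genes whose predominant isoform differs
predom_compare <- inner_join(ctrl, expo, by = "gene_name") %>%
  mutate(changed = control != exposed)

print(predom_compare)
```

In this toy case geneA switches from tx2 to tx1 between groups, while geneB keeps tx3 in both.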
Written on September 14, 2022