This article is aimed at people who are looking to “break into” the bioinformatics realm and have experience with R (ideally using the tidyverse). Bioinformatics can be a scary-sounding concept (at least it is for me) because it is such a vast and fast-developing field that it can be difficult to define exactly what it is. I’ve always thought that bioinformatics was a highly advanced field beyond what I was capable of doing, and that I would need years of technical training to begin actually doing it. …
This article is neither a technical treatment nor a comprehensive survey of the methods that control Type I and Type II error rates. It assumes some background knowledge and focuses on motivating a novel paradigm for combating the multiple hypothesis testing problem and on introducing a set of tools in R and R Shiny that you can use.
“Can you describe what’s going on in these Kaplan-Meier curves?” the interviewers asked me. I of course knew what those were, but I was admittedly stunned when they prodded me to say more. I didn’t know what else I could say. So, I stumbled through an answer, and after a while, the interviewers nodded their heads and thanked me for my time.
I didn’t get the job.
I recently participated in a relatively popular Stack Overflow “contest” (what would “popular” even mean on Stack Overflow??), where the prompt was to write a more “elegant” tidyverse solution than the one presented.
The problem statement was to perform two regressions: 1) dep ~ cov_a + cont_a + cont_b and 2) dep ~ cov_b + cont_a + cont_b.
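To make the problem statement concrete, here is a minimal base R sketch of the two regressions. It is not the posted code or any of the contest answers. The data frame `toy` and its random values are hypothetical, made up only so the example runs; the column names are taken from the two formulas.

```r
# Hypothetical toy data; only the column names come from the problem statement
set.seed(1)
toy <- data.frame(
  dep    = rnorm(50),
  cov_a  = rnorm(50),
  cov_b  = rnorm(50),
  cont_a = rnorm(50),
  cont_b = rnorm(50)
)

# Find the covariate columns by their "cov_" prefix
covs <- grep("^cov_", names(toy), value = TRUE)

# Build one formula per covariate and fit one linear model for each
fits <- lapply(covs, function(cv) {
  lm(reformulate(c(cv, "cont_a", "cont_b"), response = "dep"), data = toy)
})
names(fits) <- covs

# Coefficient table for the first model, dep ~ cov_a + cont_a + cont_b
coef(summary(fits$cov_a))
```

Building the formulas with `reformulate()` keeps the covariate name as a plain string, which avoids the `!!sym(.x)` unquoting used in the original.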
This was the original posted code:
map(.x = names(df)[grepl("cov_", names(df))],
    ~ df %>%
      mutate(res = map(data, function(y) tidy(lm(dep ~ cont_a + cont_b + !!sym(.x), data = y)))) %>%
and this was the sample dataset provided:
For the past couple of months, I’ve been building a Shiny app that researchers can use to control something called the False Discovery Rate. You can check it out here; I’ll probably write an article about it in the future. Along the way, I learned a lot of cool features from various sources: random Stack Exchange posts, Dean Attali’s blog, and Appsilon’s blog, to name a few. I’ve decided to list some of them here in this article in no particular order. …
If you’ve been using R for a while now, you may have come across the double “&” operator. Most people who’ve coded before, whether in R or some other language, have an intuitive feel for what the “&” represents. It’s a logical AND statement. “The sky is blue AND cows can fly” is a logically false statement because even though the sky is blue, the second part of the statement is false. So what the heck then does a “&&” represent?
If you look up the help page, using ?"&&", you will read “& and && indicate logical AND…The shorter form…
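A short sketch of the difference may help here. The variable names are made up for illustration; the behaviour shown is standard base R, where “&” works element by element on vectors and “&&” works on single values and stops as soon as the result is known.

```r
# Element-wise AND: compares the two vectors position by position
x <- c(TRUE, TRUE, FALSE)
y <- c(TRUE, FALSE, FALSE)
elementwise <- x & y
elementwise   # TRUE FALSE FALSE

# Scalar AND: each side is a single logical value
scalar <- TRUE && FALSE
scalar        # FALSE

# && evaluates left to right and stops once the answer is known,
# so the stop() call on the right is never run
short_circuit <- FALSE && stop("never evaluated")
short_circuit # FALSE
```

This is why “&&” is the usual choice inside an if() condition, while “&” is the one to use when filtering rows or combining logical vectors.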
Data Scientist at Merck. Tidyverse enthusiast and a neRd.