This article is aimed at people who are looking to “break into” the bioinformatics realm and have experience with R (ideally using the tidyverse). Bioinformatics can be a scary-sounding concept (at least it is for me) because it is such a vast and fast-developing field that it can be difficult to define exactly what it is. I’ve always thought that bioinformatics was a highly advanced field beyond what I was capable of doing, and that I would need years of technical training to begin actually doing it. …
This article is aimed at people who have experience with R (ideally the tidyverse) and want to learn how to start making Shiny apps. For those who haven’t heard of Shiny before, it’s a package that allows you to create web applications using R without needing to know any HTML, CSS, or JavaScript. That being said, if you do want to get deeper into app development, learning HTML, CSS, and JavaScript will let you do more powerful things and give you more control over the app development process. However, with access to so many tools, it can be overwhelming…
This article is neither a technical article nor a comprehensive survey of the methods that control Type I and Type II error rates. It assumes some background knowledge and focuses on motivating a novel paradigm for combating the multiple hypothesis testing problem and on introducing a set of tools in R and R Shiny that you can use.
If you’ve ever done statistics or read a research paper about a discovery before, the number 0.05 should ring a bell. It refers to a significance threshold…
“Can you describe what’s going on in these Kaplan-Meier curves?” the interviewers asked me. I of course knew what those were, and I was admittedly stunned when they prodded me to say more — I didn’t know what else I could say. So, I stumbled through an answer, and after a while, the interviewers nodded their heads and thanked me for my time.
I didn’t get the job.
Even though I put survival analysis as one of my skills on my CV — and I had legitimately studied and worked with it — I realized afterwards that I still had…
I recently participated in a relatively popular Stack Overflow “contest” (what would “popular” even mean on Stack Overflow??), where the prompt was to write a more “elegant” dplyr or tidyverse alternative to the solution presented.
The problem statement was to perform two regressions: 1) dep ~ cov_a + cont_a + cont_b and 2) dep ~ cov_b + cont_a + cont_b.
This was the original posted code:
map(.x = names(df)[grepl("cov_", names(df))],
    ~ df %>%
      nest() %>%
      mutate(res = map(data, function(y) tidy(lm(dep ~ cont_a + cont_b + !!sym(.x), data = y)))) %>%
      unnest(res))
and this was the sample dataset provided:
set.seed(123) df…
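For readers who want something to run, here is a minimal base-R sketch of the two-regression task described above. It is not the solution from the article or from the Stack Overflow thread, and because the original sample dataset is truncated here, `df_demo` is a hypothetical stand-in that only borrows the column names from the problem statement.

```r
# Hypothetical stand-in data: the original sample dataset is truncated,
# so only the column names (dep, cov_*, cont_*) follow the problem statement.
set.seed(1)
n <- 50
df_demo <- data.frame(
  dep    = rnorm(n),
  cov_a  = rnorm(n),
  cov_b  = rnorm(n),
  cont_a = rnorm(n),
  cont_b = rnorm(n)
)

# Find every covariate column whose name starts with "cov_".
cov_vars <- grep("^cov_", names(df_demo), value = TRUE)

# Fit one model per covariate: dep ~ <cov> + cont_a + cont_b.
fits <- lapply(cov_vars, function(v) {
  f <- reformulate(c(v, "cont_a", "cont_b"), response = "dep")
  lm(f, data = df_demo)
})
names(fits) <- cov_vars

# Coefficient tables, one per covariate.
coef_tables <- lapply(fits, function(m) coef(summary(m)))
```

Building the formula with `reformulate()` avoids the `!!sym(.x)` unquoting step inside `lm()`, at the cost of stepping outside the tidyverse idiom the contest asked for.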
For the past couple of months, I’ve been building a Shiny app that researchers can use to control something called the False Discovery Rate. You can check it out here; I’ll probably write an article about it in the future. Along the way, I learned a lot of cool features from various sources: random Stack Exchange posts, Dean Attali’s blog, and Appsilon’s blog, to name a few. I’ve decided to list some of them here in this article in no particular order. …
If you’ve been using R for a while now, you may have come across the double “&” operator. Most people who’ve coded before, whether in R or some other language, have an intuitive feel for what the “&” represents. It’s a logical AND statement. “The sky is blue AND cows can fly” is a logically false statement because even though the sky is blue, the second part of the statement is false. So what the heck then does a “&&” represent?
If you look up the help page, using ?"&&", you will read “& and && indicate logical AND…The shorter form…
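A small sketch may make the difference concrete. The variable names below are illustrative only. The example keeps both sides of “&&” at length one, since recent versions of R raise an error when “&&” is given longer vectors.

```r
x <- c(TRUE, TRUE, FALSE)
y <- c(TRUE, FALSE, FALSE)

# "&" is vectorized: it compares the two vectors element by element
# and returns a logical vector of the same length.
elementwise <- x & y

# "&&" is meant for single TRUE/FALSE values, e.g. inside if () conditions.
single <- x[1] && y[1]

# "&&" also short-circuits: when the left side is FALSE, the right side
# is never evaluated, so the stop() call below does not run.
short_circuit <- FALSE && stop("this is never evaluated")
```

In practice this means “&” is the one to reach for when filtering or combining logical vectors, while “&&” fits control flow, where the condition must be a single value.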
Data Scientist at Merck. Tidyverse enthusiast and a neRd.