Welcome

Preamble

Thank you for joining us for our mid-program challenge! We at the Programs team congratulate and thank you for your excellent work in our program. We hope that you have had meaningful experiences with us thus far, and we look forward to delivering more high-quality data science material just for you.

—Eddie Guo, Associate Director of Programs  

The Mid-Program Challenge

Today, we will apply our knowledge and skills in statistics and R programming to identify differentially expressed genes (DEGs), a common task in bioinformatics. We will be working on a recently published dataset of gene expression from healthy people as well as COVID-19 patients with mild or severe symptoms (Zhang et al., 2021). Here are our expectations for you:

  1. Understand the workflow (statistics and scripts) used for DEG analysis.

  2. Adapt the code to work on the tasks you are asked to perform.

  3. Interpret the results from tests and figures.

  4. Propose future directions and carry out further exploration based on your conclusions.

Submission

During today’s challenge, you will be given time to work on the tasks in your group. You may not be able to finish every task in session, so we ask you to complete and finalize your work by Sunday, March 14, 2021 at 10:00 PM MST. Specifically:

  1. For each task, please complete the code in the designated code blocks. Interpret and discuss your results in the Markdown text as instructed in the task description.

  2. After completing the entire .Rmd document, save the .Rmd file and click the Knit button to generate the .html file, at which point a web page containing all of your code, figures, and text should be displayed.

  3. Submit your .Rmd file and .html file using this Google Form. Download the .Rmd file that you will submit by clicking here. You will also find a few questions asking you to discuss your results from the tasks. Your answers should appear both in the Google Form AND in the .Rmd document.

Evaluation

We will evaluate the following aspects of your work:

  1. Coding.
    1. Correct usage of commands.
    2. Robustness and adaptability of the code.
    3. Good coding style and adherence to conventions.
    4. Readability of code (e.g., naming of variables, adequate commenting).
  2. Visualization.
    1. Appropriate choice of figure types.
    2. Compliance with publication standards.
  3. Reasoning and communication.
    1. Thorough understanding of the statistical methods employed at each step.
    2. Adequate discussion and correct interpretation of the results.
    3. Appropriate references to results in figures and tables.
    4. Appropriate references to the literature.
    5. Proper wording in discussion and interpretation.
    6. Critical thinking and insight.

Tip: drawing on resources beyond those provided by Youreka can strengthen your project.