<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Aastat</title>
	<atom:link href="https://www.aastat.com/feed/" rel="self" type="application/rss+xml" />
	<link>https://www.aastat.com/</link>
	<description></description>
	<lastBuildDate>Thu, 19 Aug 2021 09:59:00 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.6.2</generator>

<image>
	<url>https://www.aastat.com/wp-content/uploads/2020/09/cropped-fav-32x32.png</url>
	<title>Aastat</title>
	<link>https://www.aastat.com/</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>Creating advanced figures in R</title>
		<link>https://www.aastat.com/blog/creating-advanced-figures/</link>
		
		<dc:creator><![CDATA[Aastat]]></dc:creator>
		<pubDate>Mon, 21 Sep 2020 09:30:14 +0000</pubDate>
				<category><![CDATA[Blog]]></category>
		<guid isPermaLink="false">https://www.aastat.com/?p=227</guid>

					<description><![CDATA[<p>In this blog post we replicate the picture below, which was originally created in SAS. Let’s first generate some dummy data. We have a patient number, treatment group and change in tumor size in our dataset. We have also collected some biomarkers so that we can look for interesting correlations. In the [&#8230;]</p>
<p>The post <a href="https://www.aastat.com/blog/creating-advanced-figures/">Creating advanced figures in R</a> first appeared on <a href="https://www.aastat.com">Aastat</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<hr class="wp-block-separator"/>



<h2 class="wp-block-heading">In this blog post we replicate the picture below, which was originally created in SAS</h2>



<figure class="wp-block-image size-full is-style-default"><img fetchpriority="high" decoding="async" width="640" height="480" src="https://www.aastat.com/wp-content/uploads/2021/08/sas_plot-1.png" alt="" class="wp-image-378" srcset="https://www.aastat.com/wp-content/uploads/2021/08/sas_plot-1.png 640w, https://www.aastat.com/wp-content/uploads/2021/08/sas_plot-1-300x225.png 300w" sizes="(max-width: 640px) 100vw, 640px" /></figure>



<h2 class="wp-block-heading">Let’s first generate some dummy data:</h2>



<pre class="wp-block-code"><code>library(tidyverse)
n_pat &lt;- 25
patient &lt;- 1:n_pat
treatment &lt;- sample(c("Drug A", "Drug B"), n_pat, replace=TRUE)
change &lt;- rnorm(n_pat, 0, 20)
biomarkers &lt;- c("T790M","Ex19del","L959R","Ex20Ins","MET","ERBB2","EGFR",
                "EGFR2","PIK3CA","KRAS","CDKN2","RB1","ALK","KIT","MET2",
                "Other")
genes &lt;- matrix(sample(x=c("CC", "AA", "AC"), replace=TRUE, size=n_pat * length(biomarkers)),
                nrow=n_pat, ncol=length(biomarkers))
biomarker_groups &lt;- c(rep("Baseline", 4), rep("SCNA", 3), rep("SNV", 9))
df &lt;- data.frame(patient, treatment, change, genes)
colnames(df) &lt;- c("patient", "treatment", "change", biomarkers)
head(df)</code></pre>



<pre class="wp-block-code"><code>##    patient treatment      change T790M Ex19del L959R Ex20Ins MET ERBB2 EGFR EGFR2 PIK3CA KRAS CDKN2 RB1 ALK KIT MET2 Other
## 1        1    Drug B  17.3058647    AC      AC    AC      CC  AA    AC   AA    CC     AC   AC    AC  AC  CC  AC   CC    CC
## 2        2    Drug B   8.1824572    AC      CC    CC      AC  CC    AC   AA    AC     AC   CC    AA  AA  CC  CC   AA    AC
## 3        3    Drug B -18.5752930    AA      AA    AC      CC  AC    AA   AC    AA     AA   AA    AA  AC  AC  AA   AA    AC
## 4        4    Drug A  -5.2139298    AC      AA    AC      AA  AA    AA   CC    AA     AC   AA    CC  AC  AC  CC   CC    AC
## 5        5    Drug B   5.6130694    CC      CC    CC      CC  AA    CC   AA    AC     AC   AA    CC  AA  AC  CC   CC    AC</code></pre>



<p>Our dataset contains a patient number, a treatment group and the change in tumor size. We have also collected some biomarkers so that we can look for interesting correlations.</p>



<p>In the picture above we have 3 distinct plots:</p>



<ol class="wp-block-list"><li>The change in tumor size</li><li>Highlighted genes in biomarkers</li><li>Percentage of selected genes from each one of the biomarkers.</li></ol>



<h3 class="wp-block-heading">First plot</h3>



<p>The plot is a fairly standard bar plot, but there are some notable options that we need to set. First, there is text outside the bars indicating the change in tumor size. Second, the x-axis ticks are not just numbers: each one carries a custom string indicating that the tick represents a patient.</p>



<p>We can add the text to a bar plot using geom_text, but with only the default settings the plot is not very pleasing to look at. The stat = “identity” argument in geom_bar means that we provide the bar heights ourselves, so the function does not try to plot counts or some other summary.</p>



<pre class="wp-block-code"><code>df %&gt;% 
  ggplot(aes(x=factor(patient), y=change, label=change, fill=treatment)) +
  geom_bar(stat="identity") +
  geom_text()</code></pre>



<figure class="wp-block-image size-full"><img decoding="async" width="672" height="480" src="https://www.aastat.com/wp-content/uploads/2021/08/messy_picture.png" alt="" class="wp-image-381" srcset="https://www.aastat.com/wp-content/uploads/2021/08/messy_picture.png 672w, https://www.aastat.com/wp-content/uploads/2021/08/messy_picture-300x214.png 300w" sizes="(max-width: 672px) 100vw, 672px" /></figure>
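


<p>As a side note on stat = "identity": the following is a minimal sketch, using a made-up three-row data frame called toy (not the data from this post), that compares geom_bar(stat = "identity") with ggplot2's geom_col(). Both take the bar heights straight from the data.</p>



<pre class="wp-block-code"><code>library(ggplot2)

# Hypothetical toy data, not the data set from this post
toy &lt;- data.frame(patient = factor(1:3), change = c(12, -7, 4))

# geom_bar() counts rows by default; stat = "identity" uses y as given
p_bar &lt;- ggplot(toy, aes(x = patient, y = change)) +
  geom_bar(stat = "identity")

# geom_col() is a shorthand for the same thing
p_col &lt;- ggplot(toy, aes(x = patient, y = change)) +
  geom_col()

# Both layers end up with the same bar heights
all.equal(layer_data(p_bar)$y, layer_data(p_col)$y)</code></pre>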



<p>We can round the labels, rotate them by 90 degrees and adjust their justification so that they line up nicely outside the bars instead of sitting at the edges.<br>The expression</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p>hjust = ifelse(change &lt; 0, 1.1, -0.3)</p></blockquote>



<p>places the label above the bar if the change is positive and below the bar if the change is negative. The adjustment is called horizontal, but because the text is rotated by 90 degrees it moves the label along the vertical direction of the plot.</p>
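


<p>To see what this expression evaluates to, here is a small sketch with a few made-up values (change_example is a hypothetical vector, not part of the dataset):</p>



<pre class="wp-block-code"><code># Made-up changes in tumor size
change_example &lt;- c(-12.4, 3.8, 25.1, -0.5)

# ifelse() is vectorized, so we get one hjust value per bar
hjust_example &lt;- ifelse(change_example &lt; 0, 1.1, -0.3)
hjust_example
## [1]  1.1 -0.3 -0.3  1.1</code></pre>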



<p>We can change the x-axis tick labels with scale_x_discrete. For the label values we paste the string “pat” and the corresponding patient number together using the paste0 function. Finally, I replace the default light fill colors with darker ones to indicate importance.</p>
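


<p>The two labeling helpers can be tried on their own. This is only a sketch with made-up input values; it shows what paste0 produces for the tick labels and what formatC produces for the bar labels:</p>



<pre class="wp-block-code"><code># Tick labels: "pat" followed by the patient number
tick_labels &lt;- paste0("pat", 1:3)
tick_labels
## [1] "pat1" "pat2" "pat3"

# Bar labels: fixed notation with zero decimals
bar_labels &lt;- formatC(c(17.3058647, -5.2139298), format = "f", digits = 0)
bar_labels
## [1] "17" "-5"</code></pre>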



<pre class="wp-block-code"><code>p1 &lt;- df %&gt;%
    ggplot(aes(x=factor(patient), y=change)) +
    geom_bar(stat = "identity", aes(fill=factor(treatment))) +
    geom_text(aes(label=formatC(change, format="f", digits=0)),
              hjust=ifelse(change &lt; 0, 1.1, -0.3), angle=90,
              vjust=0.35) +
    theme(axis.title.x = element_blank(),
          panel.grid = element_blank(),
          axis.text.x = element_text(angle=45, hjust=1)) +
    ylab("Change from baseline (%)") +
    labs(fill="Treatment") +
    scale_fill_manual(values = c("#0044ba", "#9e181c")) +
    scale_x_discrete(labels = paste0("pat", patient)) + 
    ylim(min(change) - 10, max(change) + 10)
p1</code></pre>



<figure class="wp-block-image size-full"><img decoding="async" width="672" height="480" src="https://www.aastat.com/wp-content/uploads/2021/08/first_plot.png" alt="" class="wp-image-380" srcset="https://www.aastat.com/wp-content/uploads/2021/08/first_plot.png 672w, https://www.aastat.com/wp-content/uploads/2021/08/first_plot-300x214.png 300w" sizes="(max-width: 672px) 100vw, 672px" /></figure>



<h3 class="wp-block-heading">Second plot</h3>



<p>First we need to create a data frame that holds, for each biomarker, the share of patients that have the selected gene and the group the biomarker belongs to.</p>



<pre class="wp-block-code"><code>genes_df &lt;- df %&gt;%
    select(all_of(biomarkers))

pcts &lt;- colSums(genes_df == "CC") / length(df)

gene_pct_df &lt;- data.frame(pcts, biomarker_groups, biomarkers)
gene_pct_df</code></pre>



<pre class="wp-block-code"><code>##              pcts biomarker_groups biomarkers
## T790M   0.3157895         Baseline      T790M
## Ex19del 0.4210526         Baseline    Ex19del
## L959R   0.3684211         Baseline      L959R
## Ex20Ins 0.7368421         Baseline    Ex20Ins
## MET     0.3684211             SCNA        MET
## ERBB2   0.4210526             SCNA      ERBB2
## EGFR    0.4736842             SCNA       EGFR
## EGFR2   0.2105263              SNV      EGFR2
## PIK3CA  0.4210526              SNV     PIK3CA
## KRAS    0.1578947              SNV       KRAS
## CDKN2   0.3684211              SNV      CDKN2
## RB1     0.4210526              SNV        RB1
## ALK     0.4210526              SNV        ALK
## KIT     0.5789474              SNV        KIT
## MET2    0.3157895              SNV       MET2
## Other   0.4210526              SNV      Other</code></pre>



<p>For the next plot I use a helper function, percent, that converts a decimal to a percentage and appends the percent sign.</p>



<pre class="wp-block-code"><code>percent &lt;- function(x, digits = 2, format = "f", is.float=TRUE,...) {
  paste0(formatC(100 * x, format = format, digits = digits, ...), "%")
}</code></pre>
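


<p>A quick usage sketch of the helper, with the definition repeated so the snippet runs on its own. The input values are made up for illustration:</p>



<pre class="wp-block-code"><code># Same helper as above, repeated so this snippet is self-contained
percent &lt;- function(x, digits = 2, format = "f", is.float=TRUE,...) {
  paste0(formatC(100 * x, format = format, digits = digits, ...), "%")
}

# Default: two decimals
percent(c(0.5, 0.125))
## [1] "50.00%" "12.50%"

# Zero decimals, as used for the bar labels in the plot
percent(0.3157895, digits = 0)
## [1] "32%"</code></pre>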



<p>This plot is similar to the first one, but there are a few notable differences. First, the bars are horizontal instead of vertical.<br>Second, the axis ticks and labels are removed so that they do not interfere with the other plots.</p>



<p>We first create the plot in its original orientation and, as the last step, use coord_flip to flip it sideways. Removing the ticks and tick labels is done by modifying the underlying theme. In general, to remove something from the plot we use “theme(something = element_blank())”.</p>



<p>One more thing that is absolutely necessary is to order the bars by group. For this we first create the variable bio_factor, which is just the number 1, 2 or 3 depending on the group the biomarker belongs to. Using this variable we can order the x-axis (which later becomes the y-axis) by group.</p>
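


<p>Here is a small sketch of the ordering trick on its own, with four made-up entries (the vectors ending in _example are hypothetical). factor() sorts the group names alphabetically, so Baseline becomes 1, SCNA 2 and SNV 3. reorder() then sorts the biomarker levels by the negated number, and since coord_flip draws the first level at the bottom, the first group ends up at the top.</p>



<pre class="wp-block-code"><code># Made-up subset of biomarkers and their groups
biomarkers_example &lt;- c("Ex19del", "ERBB2", "KRAS", "T790M")
groups_example &lt;- c("Baseline", "SCNA", "SNV", "Baseline")

# Group names are turned into the numbers 1-3 (alphabetical order)
bio_factor_example &lt;- as.numeric(factor(groups_example))
bio_factor_example
## [1] 1 2 3 1

# Levels sorted by descending group number
ordered_levels &lt;- levels(reorder(biomarkers_example, -bio_factor_example))
ordered_levels
## [1] "KRAS"    "ERBB2"   "Ex19del" "T790M"</code></pre>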



<pre class="wp-block-code"><code>p2 &lt;- gene_pct_df %&gt;%
  mutate(bio_factor = as.numeric(factor(biomarker_groups))) %&gt;%
  ggplot(aes(x=reorder(biomarkers, -bio_factor), y=pcts)) +
  geom_bar(stat="identity", aes(fill=biomarker_groups), show.legend = F) +
  geom_text(aes(label=percent(pcts, digits=0)), hjust = -0.2, size=3) +
  theme(axis.title.x = element_blank(),
        axis.ticks.y = element_blank(),
        axis.title.y = element_blank(),
        axis.ticks = element_blank(),
        axis.text = element_blank(),
        panel.grid = element_blank()) +
  ylim(0, max(pcts) + 0.3) +
  coord_flip()
p2</code></pre>



<figure class="wp-block-image size-full is-style-default"><a href="https://www.aastat.com/wp-content/uploads/2021/08/second_plot.png"><img loading="lazy" decoding="async" width="672" height="480" src="https://www.aastat.com/wp-content/uploads/2021/08/second_plot.png" alt="" class="wp-image-382" srcset="https://www.aastat.com/wp-content/uploads/2021/08/second_plot.png 672w, https://www.aastat.com/wp-content/uploads/2021/08/second_plot-300x214.png 300w" sizes="(max-width: 672px) 100vw, 672px" /></a></figure>



<h3 class="wp-block-heading">Last plot</h3>



<p>This is the most complicated of the three plots. It contains a grid that is divided into subgroups by the biomarker groups. Cells with a specific gene are colored differently from the others.</p>



<p>Before we call the plotting functions we again add the bio_factor variable (as we did in the previous step) and a color_scheme variable that tells what color each cell should be.</p>
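


<p>The color_scheme logic can be sketched on a tiny made-up table (cells_example is hypothetical). Only cells with the value "CC" get a group-specific code; everything else falls through to "d":</p>



<pre class="wp-block-code"><code>library(dplyr)

# Made-up cells: gene value and numeric biomarker group
cells_example &lt;- tibble(
  value = c("CC", "AA", "CC", "AC", "CC"),
  bio_factor = c(1, 1, 2, 3, 3)
)

# case_when() checks the conditions in order; TRUE is the fallback
color_codes &lt;- cells_example %&gt;%
  mutate(color_scheme = case_when(
    value == "CC" &amp; bio_factor == 1 ~ "a",
    value == "CC" &amp; bio_factor == 2 ~ "b",
    value == "CC" &amp; bio_factor == 3 ~ "c",
    TRUE ~ "d")) %&gt;%
  pull(color_scheme)
color_codes
## [1] "a" "d" "b" "d" "c"</code></pre>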



<p>We create the grid with geom_raster (we could also use geom_rect, but according to the documentation geom_raster is preferred when all the tiles are the same size) and add the text as usual with geom_text. To get the grouping to work correctly we use facet_grid to break the plot into smaller grids and add options so that the grids sit closer together. I encourage you to copy the code and see what each of the options does.</p>



<pre class="wp-block-code"><code>p3 &lt;- df %&gt;%
    pivot_longer(cols=all_of(biomarkers)) %&gt;%
    left_join(., gene_pct_df, by=c("name" = "biomarkers")) %&gt;%
    mutate(bio_factor = as.numeric(factor(biomarker_groups))) %&gt;%
    mutate(color_scheme = case_when(
      value == "CC" &amp; bio_factor == 1 ~ "a",
      value == "CC" &amp; bio_factor == 2 ~ "b",
      value == "CC" &amp; bio_factor == 3 ~ "c",
      TRUE ~ "d")) %&gt;%
    ggplot(aes(x = factor(patient), y=reorder(name, bio_factor))) +
    geom_raster(aes(fill=color_scheme,
                    alpha=color_scheme),
                show.legend = F) +
    geom_text(aes(label=value), size=3,
              show.legend = F) +
    facet_grid(biomarker_groups ~ ., switch = "both", scales="free_y",
               space = "free_y") +
    scale_fill_manual(values = c("#F8766D", "#00BA38" ,"#619CFF", "white")) +
    scale_alpha_manual(values = c(0.9, 0.9, 0.9, 0.4)) +
    theme(axis.title.x=element_blank(),
          axis.ticks.y=element_blank(),
          axis.title.y=element_blank(),
          axis.text.x=element_blank(),
          axis.ticks = element_blank(),
          legend.title = element_blank(),
          panel.grid = element_blank(),
          panel.spacing.y = unit(-0.1, "lines"))
p3</code></pre>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="672" height="480" src="https://www.aastat.com/wp-content/uploads/2021/08/third_plot.png" alt="" class="wp-image-383" srcset="https://www.aastat.com/wp-content/uploads/2021/08/third_plot.png 672w, https://www.aastat.com/wp-content/uploads/2021/08/third_plot-300x214.png 300w" sizes="(max-width: 672px) 100vw, 672px" /></figure>



<h3 class="wp-block-heading">Combining the plots</h3>



<p>All that is left is to combine the three plots so that all the columns and rows line up. For this we use a library called cowplot. According to its documentation, cowplot is a library that</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p>“provides various features that help with creating publication-quality<br>figures, such as a set of themes, functions to align plots and arrange<br>them into complex compound figures, and functions that make it easy to<br>annotate plots and or mix plots with images.”</p></blockquote>



<p>The plot_grid function from the cowplot package creates table-like layouts of plots. We can specify how the plots are arranged and aligned using the arguments ncol, nrow, align and axis.</p>
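


<p>As a minimal sketch of these arguments, the snippet below stacks two throwaway scatter plots built from R's built-in mtcars data (the plots top and bottom are made up for illustration and have nothing to do with our figure):</p>



<pre class="wp-block-code"><code>library(ggplot2)
library(cowplot)

# Two throwaway plots that share the same x variable
top &lt;- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()
bottom &lt;- ggplot(mtcars, aes(x = wt, y = hp)) + geom_point()

# ncol = 1 stacks the plots in one column,
# align = "v" aligns them vertically and
# axis = "lr" lines up their left and right margins
stacked &lt;- plot_grid(top, bottom, ncol = 1, align = "v", axis = "lr")
stacked</code></pre>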



<p>First we need to remove the legend from the first plot so that the alignment works better; we add it back later.</p>



<pre class="wp-block-code"><code>library(cowplot)
p1_legend &lt;- get_legend(p1)
p1 &lt;- p1 + theme(legend.position = "none")</code></pre>



<p>Here we saved the legend from the first plot into a variable called p1_legend and hid the legend in the original plot. Now we build a nested plot_grid.</p>



<ol class="wp-block-list"><li>First we align the change plot with the gene plot.</li><li>Second we stack the legend from the first plot on top of the percentage bar plot.</li><li>We then place these two columns side by side and adjust their widths using the rel_widths argument so that the plots on the left are larger than the plots on the right.</li><li>Finally we draw the aligned plot using the ggdraw function.</li></ol>



<pre class="wp-block-code"><code>ggdraw(plot_grid(
    plot_grid(p1, p3, ncol=1, align = "v", axis="lr"),
    plot_grid(p1_legend, p2, ncol=1),
    rel_widths = c(1, 0.2)
  ))</code></pre>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="672" height="480" src="https://www.aastat.com/wp-content/uploads/2021/08/final_plot.png" alt="" class="wp-image-384" srcset="https://www.aastat.com/wp-content/uploads/2021/08/final_plot.png 672w, https://www.aastat.com/wp-content/uploads/2021/08/final_plot-300x214.png 300w" sizes="(max-width: 672px) 100vw, 672px" /></figure>



<p>We have now recreated the plot that we set out to mimic.</p>



<p>The code it took to recreate this figure is a bit shorter than the code used to create it originally in SAS. I will be posting the original SAS code on our GitHub pages and will update the URL here after that. One downside is that the plots need quite a few extra options and some tweaking to get them looking right.</p>



<p>EDIT:<br><a href="https://github.com/Aastat-FI/Aastat-FI.github.io/blob/master/extra_post_material/sas_code.sas">Link to the SAS code</a></p>



<p>Mikael Roto<br>4/8/2021</p>
<p>The post <a href="https://www.aastat.com/blog/creating-advanced-figures/">Creating advanced figures in R</a> first appeared on <a href="https://www.aastat.com">Aastat</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>PDF compiler</title>
		<link>https://www.aastat.com/blog/pdf-compiler/</link>
		
		<dc:creator><![CDATA[Aastat]]></dc:creator>
		<pubDate>Mon, 21 Sep 2020 09:29:56 +0000</pubDate>
				<category><![CDATA[Blog]]></category>
		<guid isPermaLink="false">https://www.aastat.com/?p=225</guid>

					<description><![CDATA[<p>The goal of this post is to introduce a small program developed by Aastat. Background and motivation: In medical data science we occasionally must send data to the FDA. Usually the data is parsed from tens or hundreds of individual *.txt or *.rtf files and manually added together using some text editor, which is usually Microsoft Word. This approach usually [&#8230;]</p>
<p>The post <a href="https://www.aastat.com/blog/pdf-compiler/">PDF compiler</a> first appeared on <a href="https://www.aastat.com">Aastat</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<hr class="wp-block-separator"/>



<h1 class="wp-block-heading">The goal of this post is to introduce a small program developed by Aastat</h1>



<h3 class="wp-block-heading">Background and motivation</h3>



<p>In medical data science we occasionally must send data to the FDA. Usually the data is parsed from tens or hundreds of individual *.txt or *.rtf files and manually combined in a text editor, which is usually Microsoft Word. This approach usually takes hours and is very susceptible to manual errors. On top of that, FDA standards require a table of contents page with hyperlinks.</p>



<h3 class="wp-block-heading">Light at the end of the tunnel</h3>



<p>The solution is to automate it all away. I wrote a Python script that completely automates the process. The user doesn&#8217;t need to do anything other than select the files, change a few options depending on the layout and structure of the provided files, and then press the compile button, which outputs the document with an automatically generated table of contents.</p>



<p>Best of all, the program is completely free and open source, so you can edit the code and see what code others have written. If you&#8217;d like to contribute more features, we&#8217;d love that.</p>



<h3 class="wp-block-heading">But I don&#8217;t have Python installed or resources to learn it</h3>



<p>No worries. I bundled all the dependencies into one *.exe file that provides Python and all the required libraries. All you need to do is download the project from <a href="https://github.com/Aastat-FI/PDF_Converter">GitHub</a>, extract the files from the archive and start Creator.exe (the name might change later on).</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="550" src="https://www.aastat.com/wp-content/uploads/2021/08/download-1024x550.png" alt="" class="wp-image-389" srcset="https://www.aastat.com/wp-content/uploads/2021/08/download-1024x550.png 1024w, https://www.aastat.com/wp-content/uploads/2021/08/download-300x161.png 300w, https://www.aastat.com/wp-content/uploads/2021/08/download-768x412.png 768w, https://www.aastat.com/wp-content/uploads/2021/08/download-1536x825.png 1536w, https://www.aastat.com/wp-content/uploads/2021/08/download.png 1920w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<h3 class="wp-block-heading">Okay I got the files, but what do all these options mean?</h3>



<p>The number of options looks intimidating, and most of the names do not tell you much about what a setting does. Fortunately there is extensive documentation on the <a href="https://github.com/Aastat-FI/PDF_Converter">GitHub main page</a>. There you can find everything you need to know: what the settings do, what you need to set before running the program and how the program works behind the scenes.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="703" height="643" src="https://www.aastat.com/wp-content/uploads/2021/08/settings.png" alt="" class="wp-image-390" srcset="https://www.aastat.com/wp-content/uploads/2021/08/settings.png 703w, https://www.aastat.com/wp-content/uploads/2021/08/settings-300x274.png 300w" sizes="(max-width: 703px) 100vw, 703px" /></figure>



<p>If you encounter any bugs or problems you can contact Aastat and we might fix them in the future. If you know how to program in Python and want to contribute to the project, create a pull request on GitHub and we&#8217;ll check it out!</p>



<p>Mikael Roto<br>10/8/2021</p>
<p>The post <a href="https://www.aastat.com/blog/pdf-compiler/">PDF compiler</a> first appeared on <a href="https://www.aastat.com">Aastat</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>More figures in R</title>
		<link>https://www.aastat.com/blog/more-figures-in-r/</link>
		
		<dc:creator><![CDATA[Aastat]]></dc:creator>
		<pubDate>Mon, 21 Sep 2020 09:29:41 +0000</pubDate>
				<category><![CDATA[Blog]]></category>
		<guid isPermaLink="false">https://www.aastat.com/?p=223</guid>

					<description><![CDATA[<p>In this blog post we are making the picture below. We will be using a library called tidyverse in this tutorial. Tidyverse is a collection of packages that share an underlying design philosophy, grammar and data structures. Dplyr from tidyverse provides useful &#8220;pipes&#8221; that allow piping data forward into another expression or function call. You can find more information about [&#8230;]</p>
<p>The post <a href="https://www.aastat.com/blog/more-figures-in-r/">More figures in R</a> first appeared on <a href="https://www.aastat.com">Aastat</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<h2 class="wp-block-heading">In this blog post we are making the picture below.</h2>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="647" height="484" src="https://www.aastat.com/wp-content/uploads/2021/08/plot_final.png" alt="" class="wp-image-392" srcset="https://www.aastat.com/wp-content/uploads/2021/08/plot_final.png 647w, https://www.aastat.com/wp-content/uploads/2021/08/plot_final-300x224.png 300w" sizes="(max-width: 647px) 100vw, 647px" /></figure>



<p>We will be using a library called tidyverse in this tutorial. Tidyverse is a collection of packages that share an underlying design philosophy, grammar and data structures. Dplyr from tidyverse provides the useful &#8220;pipe&#8221; operator %&gt;% (re-exported from the magrittr package), which allows piping data forward into another expression or function call.</p>



<pre class="wp-block-code"><code>library(tidyverse)</code></pre>



<p>You can find more information about tidyverse and its packages in the online documentation.<br>Let&#8217;s first generate some data that we can use in our figure.</p>



<pre class="wp-block-code"><code>n_pat &lt;- 25
patient &lt;- 1:n_pat
censoring &lt;- ceiling(rexp(n_pat, 1/30))
tumor_shrink &lt;- (rbeta(n_pat, 2, 2) - 0.5) * 100

n_parameters &lt;- 15
parameters &lt;- paste("Parameter", 1:n_parameters)

response &lt;- sample(c("PR", "NE", "CR", "PD", "SD"), size=n_pat,
                   replace = T)

missing_combination &lt;- sample(c(TRUE, FALSE), size=n_pat, replace=T, prob = c(0.1, 0.9))

changes &lt;- matrix(runif(n_pat * n_parameters, 1, 100), nrow=n_pat, ncol=n_parameters)
changes&#91;sample(1:dim(changes)&#91;1], 4, replace = FALSE), sample(1:dim(changes)&#91;2], 5, replace = F)] &lt;- NA

df &lt;- data.frame(patient, censoring, tumor_shrink, changes, missing_combination)
colnames(df) &lt;- c("patient", "censoring", "tumor_shrink", parameters, "missing_combination")
head(df)</code></pre>



<pre class="wp-block-code"><code>##    patient censoring tumor_shrink Parameter 1 Parameter 2 Parameter 3 Parameter 4  Parameter 5 Parameter 6 Parameter 7 Parameter 8 Parameter 9 Parameter 10  Parameter 11 Parameter 12 Parameter 13 Parameter 14 Parameter 15
## 1        1        50   -33.5020150  76.692716  12.905816  51.320504    6.95165    3.702113  80.529008  52.739191  35.523220  20.034390   98.995443     52.34731   89.944140   22.969047   86.286218   37.865428
## 2        2        18   -34.3932674  71.841917  94.354270   4.175872   40.83416   78.104306  95.018387  86.157191  44.547260  66.223263    4.477640     19.13054   57.357747   66.792806   57.220612   71.090477
## 3        3        19    25.5744672         NA         NA  75.877590   51.54885   59.516219  47.779858  22.964046  20.790171  27.846610   46.499506           NA   42.132381   26.674702          NA          NA
## 4        4        10     4.2591308  90.204811  36.336677  39.754126   72.06269    1.586673  60.106080  40.002346  47.315590  56.189063   78.099096     46.84222    6.844924   80.998685   77.085822   38.931028
## 5        5        14    -8.4798810  33.499890  13.695571  28.529885   87.61651   99.882826  71.494717  60.329041  58.260342  51.893355   78.442637     41.88079   75.042574   58.337938   78.939537    1.698262
##   missing_combination
## 1               FALSE
## 2               FALSE
## 3                TRUE
## 4               FALSE
## 5                TRUE</code></pre>



<p>Here we have generated a data frame containing example patients, how their tumor size changed from the start of the study until the end, and the time after which they were censored from the study (quit, died, etc.). On top of that we have measurements of different anonymized parameters, named Parameter [number]. Note that the data has missing values, indicated by NA. The data frame also has a column called missing_combination, which indicates whether there were problems while gathering the data: TRUE indicates problems and FALSE indicates that the data was gathered fine. Note that if you run the code yourself you may get different results, because the data is random. You can fix the random seed by calling set.seed() with any integer, so that the data stays the same from run to run.</p>
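


<p>A minimal sketch of how set.seed() makes random draws repeatable. The seed value 123 is an arbitrary choice, not the one used for the figures in this post:</p>



<pre class="wp-block-code"><code># Same seed, same random numbers
set.seed(123)
first_draw &lt;- runif(3, 1, 100)

set.seed(123)
second_draw &lt;- runif(3, 1, 100)

identical(first_draw, second_draw)
## [1] TRUE</code></pre>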



<p>The figure consists of four individual plots: three smaller plots stacked on top of each other and a larger plot under those three. Let&#8217;s create the topmost plot first.</p>



<pre class="wp-block-code"><code>p1 &lt;- df %&gt;%
  mutate(color = case_when(
    response == "PR" ~ "lightgreen",
    response == "NE" ~ "white",
    response == "CR" ~ "darkgreen",
    response == "PD" ~ "red",
    response == "SD" ~ "yellow"
  )) %&gt;% 
  arrange(tumor_shrink) %&gt;% 
  mutate(patient = factor(patient, levels=patient)) %&gt;% 
  ggplot(aes(x=patient, y=1, fill=color)) +
  geom_raster() +
  geom_tile(color="black", size=1) +
  geom_text(aes(label=response), size=3) +
  theme(axis.text = element_blank(),
        axis.ticks = element_blank(),
        axis.title.x = element_blank(),
        legend.position = "none",
        axis.title.y = element_text(angle = 0, vjust=0.57, size = 12),
        plot.margin = unit(c(5, 0, 0, 0), "pt")) +
  scale_fill_identity() +
  labs(y="Best ov. resp") +
  coord_fixed()
p1</code></pre>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="610" height="49" src="https://www.aastat.com/wp-content/uploads/2021/08/plot1.png" alt="" class="wp-image-393" srcset="https://www.aastat.com/wp-content/uploads/2021/08/plot1.png 610w, https://www.aastat.com/wp-content/uploads/2021/08/plot1-300x24.png 300w" sizes="(max-width: 610px) 100vw, 610px" /></figure>



<p>Everything looks pretty standard except the arrange() and mutate() calls. We want to sort our patients by the change in the size of their tumor. First we arrange the rows by tumor_shrink, and after that we convert the patient column from an integer into a factor whose levels follow the sorted row order. Because ggplot places a discrete variable on the axis in the order of its factor levels, the patients are drawn in the sorted order instead of in numerical order. We could also have called factor() directly inside aes(), but doing the conversion in the data keeps the approach consistent with the later plots.</p>
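


<p>Here is a small sketch of the arrange() and mutate() step on four made-up patients (sorting_example is a hypothetical table). The factor levels end up in the order of tumor_shrink, and that is the order ggplot uses on the x-axis:</p>



<pre class="wp-block-code"><code>library(dplyr)

# Made-up patients and tumor changes
sorting_example &lt;- tibble(patient = 1:4, tumor_shrink = c(10, -30, 25, -5))

sorted_example &lt;- sorting_example %&gt;%
  arrange(tumor_shrink) %&gt;%
  mutate(patient = factor(patient, levels = patient))

# Levels follow the sorted rows, not the numerical order
levels(sorted_example$patient)
## [1] "2" "4" "1" "3"</code></pre>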



<p>scale_fill_identity() tells ggplot to use the values in the fill column as the colors themselves. It is useful when you want to set the colors manually using mutate and if/else conditions.</p>



<p>The second and third plots are fairly similar to the first one. Again we use “hacks” to get the plot looking right. We pass the patients as x-values and keep the y-value at a constant 1. In each square we print the value that we want to show (censoring), pass the colors with aes(…, fill=color) and finally draw the black lines around each square with geom_tile.</p>



<p>Onto the next plot!</p>



<pre class="wp-block-code"><code>p2 &lt;- df %&gt;%
  arrange(tumor_shrink) %&gt;% 
  mutate(patient = factor(patient, levels=patient)) %&gt;% 
  mutate(color = ifelse(missing_combination, "white", "gray")) %&gt;% 
  ggplot(aes(x=factor(patient), y=1, fill=color)) +
  geom_raster() +
  geom_tile(color="black", size=1) +
  geom_text(aes(label=censoring), size=3) +
  theme(axis.text = element_blank(),
        axis.title = element_blank(),
        axis.ticks = element_blank(),
        legend.position = "none",
        axis.title.y = element_text(angle = 0, vjust=0.57, size=12),
        plot.margin = unit(c(-5, 0, 0, 0), "pt")) +
  scale_fill_identity() + 
  labs(y="Censoring")+
  coord_fixed()
p2</code></pre>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="608" height="66" src="https://www.aastat.com/wp-content/uploads/2021/08/plot2.png" alt="" class="wp-image-394" srcset="https://www.aastat.com/wp-content/uploads/2021/08/plot2.png 608w, https://www.aastat.com/wp-content/uploads/2021/08/plot2-300x33.png 300w" sizes="(max-width: 608px) 100vw, 608px" /></figure>



<p>This plot is again similar to the previous two, and the code should be self-explanatory if you understood how to make those. The main difference in this section is the use of scale_fill_gradient(), which gives us a nice gradient of colors from the minimum of the tumor_shrink variable to the maximum.</p>



<pre class="wp-block-code"><code>p3 &lt;- df %&gt;% 
  arrange(tumor_shrink) %&gt;% 
  mutate(patient = factor(patient, levels=patient)) %&gt;% 
  ggplot(aes(x=patient, y=1, fill=tumor_shrink)) +
  geom_raster(alpha=0.8) +
  geom_tile(color="black", size=1) +
  geom_text(aes(label=formatC(tumor_shrink, 0, format="f")), 
            size=3) +
  theme(axis.text = element_blank(),
        axis.title = element_blank(), 
        axis.ticks = element_blank(),
        axis.title.y = element_text(angle = 0, vjust=0.57, size=12),
        plot.margin = unit(c(-5, 0, 0, 0), "pt"),
        legend.position = "none") +
  scale_fill_gradient(low="green", high="red") +
  labs(y="Tumor shrink")+
  coord_fixed()
p3</code></pre>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="626" height="80" src="https://www.aastat.com/wp-content/uploads/2021/08/plot3.png" alt="" class="wp-image-395" srcset="https://www.aastat.com/wp-content/uploads/2021/08/plot3.png 626w, https://www.aastat.com/wp-content/uploads/2021/08/plot3-300x38.png 300w" sizes="(max-width: 626px) 100vw, 626px" /></figure>



<p>Now we need to transform the dataframe into long format and normalize the values into the [-100, 100] range. For this we use the following function:</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="147" src="https://www.aastat.com/wp-content/uploads/2021/08/equation-1024x147.png" alt="" class="wp-image-396" srcset="https://www.aastat.com/wp-content/uploads/2021/08/equation-1024x147.png 1024w, https://www.aastat.com/wp-content/uploads/2021/08/equation-300x43.png 300w, https://www.aastat.com/wp-content/uploads/2021/08/equation-768x110.png 768w, https://www.aastat.com/wp-content/uploads/2021/08/equation.png 1369w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<p>In the equation above, x&#8217; is the scaled vector of values and x is the original vector. The fraction inside the parentheses normalizes the <em>x</em> values into [0, 1], and then we transform them to the desired [-100, 100] range. Here is the same thing as an R function.</p>



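<pre class="wp-block-code"><code>normalize &lt;- function(x, na.rm = TRUE) {
  # Shift the values so that the minimum becomes 0
  up &lt;- x - min(x, na.rm = na.rm)
  # Width of the original range
  down &lt;- max(x, na.rm = na.rm) - min(x, na.rm = na.rm)
  # Map [0, 1] onto [-100, 100]
  return((2 * (up / down) - 1) * 100)
}</code></pre>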
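


<p>To check the arithmetic, here is a short worked example on a made-up vector. The function is repeated so that the snippet runs on its own, and the input values are assumptions, not data from the post.</p>



<pre class="wp-block-code"><code># Same normalization as in the post, repeated so the snippet is self-contained.
normalize &lt;- function(x, na.rm = TRUE) {
  up &lt;- x - min(x, na.rm = na.rm)
  down &lt;- max(x, na.rm = na.rm) - min(x, na.rm = na.rm)
  (2 * (up / down) - 1) * 100
}

# Made-up values: minimum 2, maximum 10, one missing value.
vals &lt;- c(2, 4, 6, NA, 10)
scaled &lt;- normalize(vals)
scaled  # -100 -50 0 NA 100</code></pre>



<p>The minimum maps to -100, the maximum to 100, and the missing value stays missing.</p>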



<p>In the next block we pivot the dataframe into long format and apply our normalization function, which ignores missing (NA) values when computing the range. We also create a column called pat, which is factor(patient) with its levels ordered by tumor_shrink, while the original numeric patient column is kept. This is needed so that we can sort the columns of the last plot by the tumor_shrink values. To add something more to the plot, I decided to add markers that could indicate some importance. For this exercise I have flagged the cells whose absolute scaled value is higher than 85.</p>



<pre class="wp-block-code"><code>cdf &lt;- df %&gt;% 
  pivot_longer(all_of(parameters)) %&gt;% 
  mutate(scaled_val = normalize(value)) %&gt;% 
  mutate(important = ifelse((abs(scaled_val) &gt; 85), TRUE, FALSE)) %&gt;% 
  replace_na(list(important = FALSE)) %&gt;% 
  arrange(tumor_shrink) %&gt;% 
  # Levels in ascending tumor_shrink order, the same order as in p2 and p3
  mutate(pat=factor(patient, levels = unique(patient), ordered=TRUE))</code></pre>
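


<p>The pivot-and-flag step can be sketched on a tiny made-up data frame, assuming dplyr and tidyr are installed. The marker_a and marker_b columns are hypothetical stand-ins for the biomarkers. Note that, as in the block above, there is no group_by(), so the scaling uses the pooled minimum and maximum of all biomarker values.</p>



<pre class="wp-block-code"><code>library(dplyr)
library(tidyr)

# Same normalization as in the post, repeated so the snippet is self-contained.
normalize &lt;- function(x, na.rm = TRUE) {
  up &lt;- x - min(x, na.rm = na.rm)
  down &lt;- max(x, na.rm = na.rm) - min(x, na.rm = na.rm)
  (2 * (up / down) - 1) * 100
}

# Hypothetical toy data with two made-up biomarker columns.
toy &lt;- data.frame(patient = 1:3,
                  marker_a = c(1, 5, NA),
                  marker_b = c(3, 9, 2))

toy_long &lt;- toy %&gt;%
  pivot_longer(c(marker_a, marker_b)) %&gt;%   # one row per patient and biomarker
  mutate(scaled_val = normalize(value)) %&gt;% # pooled over both biomarkers
  mutate(important = !is.na(scaled_val) &amp; abs(scaled_val) &gt; 85)

toy_long</code></pre>



<p>Only the pooled minimum (1) and maximum (9) pass the threshold here, and the missing cell is never flagged.</p>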



<p>With most of the work done in preparing the dataframe, creating the plot from it is pretty easy. The plot itself is similar to the one created in the <a href="https://aastat-fi.github.io/2020/05/09/advanced_plot.html">previous post</a>.</p>



<pre class="wp-block-code"><code>p4 &lt;- cdf %&gt;% 
  ggplot(aes(x=pat, y=name, fill=scaled_val)) +
  geom_raster(alpha=0.85) +
  geom_text(data=filter(cdf, important), aes(label="★"), colour="black",
            size=8, vjust=0.2, alpha=0.9) +
  scale_fill_gradient2(low="blue", mid="white", high="red", guide="none") +
  # Build each label from its own factor level so the labels always match the columns
  scale_x_discrete(labels = function(x) paste0("Subj ", x)) +
  theme(axis.title = element_blank(),
        panel.grid = element_blank(),
        axis.text.x = element_text(angle=-45, hjust=0.3),
        plot.margin = unit(c(10, 5, 5, 5), "pt")) +
  coord_fixed()
p4</code></pre>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="577" height="355" src="https://www.aastat.com/wp-content/uploads/2021/08/plot4.png" alt="" class="wp-image-397" srcset="https://www.aastat.com/wp-content/uploads/2021/08/plot4.png 577w, https://www.aastat.com/wp-content/uploads/2021/08/plot4-300x185.png 300w" sizes="(max-width: 577px) 100vw, 577px" /></figure>



<p>Now all we need to do is combine the plots. This time there is no need to use cowplot, as we can use a somewhat simpler method from a library called patchwork. Patchwork is a brilliant library that allows joining plots using arithmetic operators. You may have been wondering why we needed to specify the plot margins. The three plots on top of the bigger plot sit tightly together, and to mimic that we need to remove the plot margins.</p>



<pre class="wp-block-code"><code>library(patchwork)
p1 / p2 / p3 / p4</code></pre>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="647" height="484" src="https://www.aastat.com/wp-content/uploads/2021/08/plot_final-1.png" alt="" class="wp-image-398" srcset="https://www.aastat.com/wp-content/uploads/2021/08/plot_final-1.png 647w, https://www.aastat.com/wp-content/uploads/2021/08/plot_final-1-300x224.png 300w" sizes="(max-width: 647px) 100vw, 647px" /></figure>
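


<p>If the default panel heights do not suit a figure, patchwork also offers plot_layout() for setting them. The following is only a sketch on made-up toy plots, assuming ggplot2 and patchwork are installed. The plots top and bottom and the 3:1 ratio are hypothetical, and with coord_fixed() as used in the post the panel sizes may be constrained by the fixed aspect ratio instead.</p>



<pre class="wp-block-code"><code>library(ggplot2)
library(patchwork)

# Hypothetical toy data and plots, not the post's p1 to p4.
toy &lt;- data.frame(x = factor(1:3), y = c(2, 5, 3))

top &lt;- ggplot(toy, aes(x = x, y = y)) +
  geom_col() +
  theme(plot.margin = unit(c(0, 0, 0, 0), "pt"))

bottom &lt;- ggplot(toy, aes(x = x, y = 1, fill = y)) +
  geom_tile(color = "black") +
  theme(legend.position = "none",
        plot.margin = unit(c(0, 0, 0, 0), "pt"))

# `/` stacks the plots vertically; plot_layout() sets the relative heights.
stacked &lt;- top / bottom + plot_layout(heights = c(3, 1))
stacked</code></pre>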



<p>Mikael Roto<br>14/8/2021</p>
<p>The post <a href="https://www.aastat.com/blog/more-figures-in-r/">More figures in R</a> was first published on <a href="https://www.aastat.com">Aastat</a>.</p>
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
