A while ago I was asked to give a presentation at my job about using R to create statistical graphics. I had also just read some reviews of the Slidify package in R and I thought it would be extremely appropriate to create my presentation about visualization in R, in R. So I set about breaking in the Slidify package and I've got to give a huge shout out to Ramnath Vaidyanathan who created this package. It was a pleasure to use and at one point I had a question and I posted on his GitHub and he responded helpfully and promptly. What a great guy! Here's a link to the final product posted on RPubs:
http://rpubs.com/rnorberg/3117

Check it out, it's full of gorgeous figures and formatting. It's really an accomplishment to get something this nice looking out of a command line program. Most of the credit for the formatting beauty goes to the Slidify package. It uses markdown in much the same way as knitr does, so if you're familiar with that, using Slidify isn't a big jump.

Some comments on the whole experience:

  1. While Slidify does a nice job with formatting (notice in particular headings, regular text, bullets, and set off code chunks with the different background color), especially for markdown, it's not perfect. For example I had trouble gauging the appropriate amount of material for each slide. If you try to cram too much on one slide, instead of compressing to fit all of it (as PowerPoint would), your text/figure/whatever just hangs off of the bottom of the slide out of sight. It was frustrating to have to guess and check the amount of material in each slide, especially with figures and material I decided to go back and add.
  2. I loved the fact that I could add images from my hard drive, not just images generated in R. You'll notice in my slideshow that I made use of this when I took screen shots of reshapeGUI in action and included them. This was really easy to do and a great feature. I would say the same for hyperlinks. These were easy to add and extremely useful since I ended up publishing the presentation to the web. Those watching didn't have to jot down a link or a book title every time I suggested a good place to find extra info about something. I just emailed around a link to the presentation and everyone had all of the resources just a click away.
  3. This brings me to the most frustrating part of the whole thing: publishing online. Now this is not an issue with Slidify, but rather with RPubs and GitHub. I'd never published anything from R directly to either, and setting this functionality up proved extremely painful. I first tried to push the thing to GitHub because I already had a GitHub account, but I never managed to figure it out. After much frustration I tried RPubs. By pure determination I finally stumbled my way through the setup process for that and eventually published the presentation to my new RPubs account. I honestly don't even remember what all I had to do, but I remember this being incredibly frustrating. The documentation accompanying Slidify could be improved by adding a how-to section for those who have never uploaded to either online repository. UPDATE: The author of the package pointed out to me that you can also publish your slideshow using Dropbox. This is done simply by saving the slideshow into your Dropbox folder. I wish I'd known this, because you can't get much easier than that!

In summation, I was really impressed with Slidify, but in the end, there was no reason to use it instead of a traditional PowerPoint (other than to show off that I had made my presentation about R, in R). PowerPoint (or Google Docs or some other free presentation software) would give me the same functionality and more without having to struggle from behind a command line. The only reason I can see to do a presentation this way is if you need to update it frequently (for example, to refresh it regularly with recent data), much the same kind of thing you might use a markdown document for. The problem is that as your data changes, your output and figures will change, but your text discussing them will not, which is dangerous. The same goes for markdown documents though, and those get used quite frequently. So overall, I highly recommend Slidify, but you need to have the right reasons to use it. Happy presenting!

Purpose

The caret package includes a function for data splitting, createTimeSlices(), that creates data partitions using a fixed or growing window. The main arguments to this function, initialWindow and horizon, allow the user to create training/validation resamples consisting of contiguous observations with the validation set always consisting of n = horizon rows. If fixedWindow = TRUE, the training set always has n = initialWindow rows.
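
To make the fixed versus growing window idea concrete, here is a rough base R sketch. The helper time_slices() and its argument names are hypothetical and written only for illustration. It is not caret's implementation; it just mirrors the windowing behaviour described above for a series of n rows.

```r
# Hypothetical helper (NOT caret's code): returns row indices for each
# training/validation resample over a series of n contiguous observations.
time_slices <- function(n, initial_window, horizon = 1, fixed_window = TRUE) {
  stopifnot(initial_window + horizon <= n)
  # Last row of each training window
  stops <- seq(initial_window, n - horizon)
  # First row of each training window: slides forward when the window is
  # fixed, stays at row 1 when the window is allowed to grow
  starts <- if (fixed_window) stops - initial_window + 1 else rep(1, length(stops))
  list(
    train = Map(seq, starts, stops),
    # Validation rows are always the `horizon` rows right after the training rows
    test  = Map(seq, stops + 1, stops + horizon)
  )
}

fixed   <- time_slices(n = 10, initial_window = 5, horizon = 2, fixed_window = TRUE)
growing <- time_slices(n = 10, initial_window = 5, horizon = 2, fixed_window = FALSE)
```

With these made-up values, every training set in fixed has 5 rows, the training sets in growing get one row longer with each resample, and every validation set has 2 rows in both cases.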

Understanding data.table Rolling Joins

Robert Norberg

June 5, 2016

Introduction

Rolling joins in data.table are incredibly useful, but not that well documented. I wrote this to help myself figure out how to use them and perhaps it can help you too.

library(data.table)

The Setup

Imagine we have an eCommerce website that uses a third party (like PayPal) to handle payments.

A Custom caret C5.0 Model for 2-Class Classification Problems with Class Imbalance

Robert Norberg

Monday, April 06, 2015

Introduction

In this post I share a custom model tuning procedure for optimizing the probability threshold for class imbalanced data. This is done within the excellent caret package framework and is akin to the example on the package website, but the example shows an extension of the random forest (or rf) method while I present an extension to the C5.0 method.

Getting Data From One Online Source

Robert Norberg

Hello world. It’s been a long time since I posted anything here on my blog. I’ve been busy getting my master’s degree in statistical computing and I haven’t had much free time to blog. But I’ve been writing R code as much as ever. Now, with graduation approaching, I’m job hunting and I thought it would be good to put together a few things to show potential employers.

Generating Tables Using Pander, knitr, and Rmarkdown

I use a pretty common workflow (I think) for producing reports on a day to day basis. I write them in rmarkdown using RStudio, knit them into .html and .md documents using knitr, then convert the resulting .md file to a .docx file using pander, which is really just a way of communicating with Pandoc via my R terminal.

R vs. Perl/mySQL - an applied genomics showdown

Recently I was given an assignment for a class I'm taking that got me thinking about speed in R. This isn't something I'm usually concerned with, but the first time I tried to run my solution (using plyr's ddply()) it was going to take all night to compute.

Stop Sign Sampling Project

Post 1: Planning Phase

Welcome back to the blog y'all. It's been a while since my last post and I've got some fun stuff for you. I'm currently enrolled in a survey sampling methodology class and we've been given a semester-long project, which I will of course be doing entirely in R. My group's assignment is to estimate the proportion of cars that actually stop at a stop sign in Chapel Hill.

In class today we were discussing several types of survey sampling and we split into groups and did a little investigation. We were given a page of 100 rectangles with varying areas and took 3 samples of size 10. Our first was a convenience sample. We just picked a group of 10 rectangles adjacent to each other and counted their area. Next, we took a simple random sample (SRS), numbering the rectangles 1 through 100 and choosing 10 with a random number generator.
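
The simple random sample step can be sketched in a couple of lines of base R. The seed below is an assumption, there only so the draw is reproducible, and the rectangle areas themselves are not included because they were on the class handout.

```r
set.seed(42)  # assumed seed, only so the same 10 rectangles are drawn each time

rectangle_ids <- 1:100                       # rectangles numbered 1 through 100
srs_ids <- sample(rectangle_ids, size = 10)  # sample() draws without replacement by default
```

Each rectangle has the same chance of ending up in srs_ids, which is what separates this from just grabbing 10 adjacent rectangles.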

For a class I'm taking this semester on genomics we're dealing with some pretty large data and for this reason we're learning to use mySQL. I decided to be a geek and do the assignments in R as well to demonstrate the ability of R to handle pretty large data sets quickly.