7 min read

Ice Cream for R Programmers


Turtletools

What and why ice cream is.

This post - perhaps contrary to the headline - will not be about eating ice cream. It will, however, be about something we all do too. About something, we do too often, even though we shouldn’t. About debugging with a print().

Let’s face it; it happens to us. We have a rich arsenal to fight bugs. We have error messages, advanced tracebacks from rlang, classic browser() and debug() functions, logging utilities or IDE built-in breakpoints. And yet. In our code, there happen to be code snippets scattered here and there like the following:

some_advanced_function <- function(x) {
  print("HERE")
  y <- other_function(x)
  print(y)
  more_calculations()
  print("I HOPE IT GETS THERE!!!!")
}

We’re all lazy, and that’s why we use print-driven debugging (and I’m sure I’ll expand on the laziness at another time). (It’s funny that acutely lazy evaluation is an acutely brilliant solution.) All the other solutions mentioned above require a little more effort, whereas adding a print() call to the code is almost effortless. Almost. As it usually turns out, the devil is in the detail.

Since we all do it and we all consider it to be a kind of inappropriate practice, perhaps it would be worth taking some steps to make it more appropriate after all? And at the same time, pay even more tribute to our laziness? That was the idea behind the developer of the Python library IceCream. And this is the idea that guided me and Lewin Appleton-Fox in creating R version of icecream.

How to eat ice cream and why you should do it.

The most basic usage is straightforward. The library’s primary function is ic(), which takes an argument of any kind, prints its value to the screen and returns it invisibly.

library(icecream)

ic(12)
## ℹ ic| `12`: num 12

We can immediately see the first advantage over print: typing ic() is faster than typing “print”. Since we are here because we like to make our life more convenient, that might be important to one.

That would be useless, though, if we could not include it within a function. And of course, we can do it. We will reuse the previous example but replace the print() calls with ic(). And let’s pack the function and other functions into an external file called external.R.

# external.R

other_function <- function(x) x * 5 - 7
more_calculations <- function() NULL

some_advanced_function <- function(x) {
  ic("HERE")
  y <- other_function(x)
  ic(y)
  more_calculations()
  ic("I HOPE IT GETS THERE!!!!")
}
source("external.R")
some_advanced_function(42)
## ℹ ic| `"HERE"`: chr "HERE"
## ℹ ic| `y`: num 203
## ℹ ic| `"I HOPE IT GETS THERE!!!!"`: chr "I HOPE IT GETS THERE!!!!"

That works well. And you can also see another reason to use ic(): it automatically prints expression alongside its value. That might be very useful, especially if we have many prints and don’t want to check which one corresponds to which value.

But, as you have probably noticed, this verbose printing introduces redundancy when printing HERE etc. Well, that is on purpose. With icecream you don’t need to provide a message to get the context. Let’s modify the function slightly and remove those locator messages.

# external_2.R

some_advanced_function_2 <- function(x) {
  ic()
  y <- other_function(x)
  ic(y)
  more_calculations()
  ic()
}
source("external_2.R")
some_advanced_function_2(42)
ℹ ic| `some_advanced_function_2()` in external_2.R:3:2
ℹ ic| `y`: num 203
ℹ ic| `some_advanced_function_2()` in external_2.R:7:2

Now you can see that ic() is even more clever! It includes as precise information about the file and position in the file as possible if you call it without arguments. This context inclusion makes debugging even more seamless!

When the context of the file is not available (e.g. when calling a function created from a console), an environment of the source is printed:

a_function <- function(x) {
  x + 2
  ic()
}

a_function(0)
## ℹ ic| `a_function()` in <env: global>

(Side note: I need to admit that I lied a little with the above code: compiling this code with R markdown does not preserve the file names when sourcing. The output is pasted manually because that would call it from the console. knitr removes references when calling source, and we have not found a way around it. But that is not a concern since Rmd documents are usually final results.)

What is the biggest problem with print-driven debugging? Our memory. It might be a problem if we forget to remove those print()s. Especially when, being irritated, we include some curse words here. And then send it to the client (anticipating questions: I have not done that, but my colleague has…). Thankfully, ic() is help here! It is enough to call:

ic_disable()

to disable all the ic() calls. They behave just like a regular identity() now. There is a counterpart, ic_enable(), which does the opposite. If you are, for example, building a shiny app, you can enable and disable ic() conditionally depending on whether your app is in development or production mode. Additionally, using ic() when building a package is a safety net. If you don’t import the ic() function and rely solely on loading the library externally, the package will not pass checks.

Last and not least, there is a dose of customization possibilities. You can e.g. change the function used to peek and the number of lines printed, as you can see in the chunk below:

# standard settings:
ic(mtcars)
## ℹ ic| `mtcars`: data.frame [32 x 11]: $'mpg': dbl [32], $'cyl': dbl [32], ...
# modified settings:
options(icecream.max.lines = 5)
options(icecream.peeking.function = head)

ic(mtcars)
## ℹ ic| `mtcars`: 
## mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1

A few words on ice cream production.

The inner workings of icecream are pretty curious, and I encourage you heavily to explore the code if you want to see rlang in action. I will comment on a few code snippets to give you a glimpse.

Using ic() is possible because we can suspend the evaluation of arguments:

ic <- function(x) {
  # capture the input to allow us to work with the expression and value separately
  q <- rlang::enquo(x)
  ...
}

Now, q holds the expression alongside with its evaluation environment. We can quickly transform it into a string by deparsing it:

deparsed_expression <- rlang::expr_deparse(rlang::quo_get_expr(q))

To get the value of the expression, we can enforce evaluation:

x <- rlang::eval_tidy(q)

Getting the precise location of the ic() call is more tricky. We can do it, however, with the usage of rlang functions for analyzing the stack of calls:

# this code is simplified and not guaranteed always to work

# inspecting the traceback and extracting the call stack:
trace <- rlang::trace_back()
call_stack <- trace$call

# getting length of the call stack:
num_calls <- length(call_stack)

# accessing the second-to-last call (to omit the `ic` call) and extracting parent reference:
parent_ref <- call_stack[[num_calls - 1]][[1]]

# extracting location from the reference:
ref <- attr(call_stack[[num_calls]], "srcref")
loc <- rlang:::src_loc(ref)

Notice that rlang:::src_loc() is not an exported function, as it is not meant for end users of the package. We allowed ourselves to borrow the code of the function as it is crucial for our package.

Plans for the ice cream industry and conclusion.

If you practice using print to debug your programs, do it with class. That is what icecream offers. Its main advantages are less typing, convenient information printing, the inclusion of context, easy disabling and customization. The package is still under development, and our current goal is to fully implement the original Python version’s functionalities. I encourage you to give it a try!


If you have any suggestions for the icecream, do not hesitate to leave an issue on our repo.