Opened 3 years ago

Closed 8 months ago

#24707 closed enhancement (fixed)

Define R coding guidelines for Metrics' products

Reported by: iwakeh Owned by: metrics-team
Priority: Medium Milestone:
Component: Metrics Version:
Severity: Normal Keywords:
Cc: Actual Points: 0.1
Parent ID: Points:
Reviewer: Sponsor:

Description

Look for useful style guides elsewhere, if nothing comes up at least define basic rules in common with other languages used in Metrics' products.

For a start:

  • line length max 80 characters
  • indentation as in Java
  • line breaks _before_ operators (as in Java)
  • ...

Child Tickets

Change History (11)

comment:1 Changed 3 years ago by iwakeh

Example for line breaks here.

comment:2 Changed 3 years ago by karsten

Good idea to define some guidelines. Ideally, we'd even have a checkstyle thingy for them.

Here's a random example from "R for Data Science", which I guess is The R Book that we should not ignore:

# Compute rate per 10,000
table1 %>% 
  mutate(rate = cases / population * 10000)

# Compute cases per year
table1 %>% 
  count(year, wt = cases)

# Visualise changes over time
library(ggplot2)
ggplot(table1, aes(year, cases)) + 
  geom_line(aes(group = country), colour = "grey50") + 
  geom_point(aes(colour = country))

https://github.com/hadley/r4ds/blob/master/tidy.Rmd

comment:3 Changed 3 years ago by iwakeh

From a recent R tuning ticket and commit:

Performing pre-processing separately, helping R by defining read types, and avoid multiple casting operations.

So, always use colClasses parameter in read.csv and watch out for implicit casts.

comment:4 Changed 3 years ago by iwakeh

Stumbled upon some R style guides:

comment:5 in reply to:  4 ; Changed 3 years ago by karsten

Replying to iwakeh:

Stumbled upon some R style guides:

On first sight this looks best. I installed the lintr package and it showed me useful output.

Might make sense to look into this more, because we're also using Google's Java style. But we need a tool.

The part where I stopped reading was "Assignment operator: Use = instead of <- for assignments." Sorry, you're out.

comment:6 in reply to:  5 Changed 3 years ago by iwakeh

Replying to karsten:

Thanks for looking!

Replying to iwakeh:

Stumbled upon some R style guides:

On first sight this looks best. I installed the lintr package and it showed me useful output.

Yes, I think so too. Will try the lintr package.

Might make sense to look into this more, because we're also using Google's Java style. But we need a tool.

I think it is very terse almost seeming unfinished.

The part where I stopped reading was "Assignment operator: Use = instead of <- for assignments." Sorry, you're out.

True, the R-majority uses <-, but at least the link quoted there is helpful.

comment:7 Changed 3 years ago by iwakeh

Regarding licensing: tidyverse is GPL3; I couldn't find anything on the styleguide itself.

comment:8 Changed 3 years ago by karsten

I just finished reading the tidyverse style guide, and I like it. I'd say unless you find anything speaking against it or for another guide/tool, let's use it.

(Note that we wouldn't use the full tidyverse package just yet. It's not fully packaged in Debian. So far we use ggplot2, tidyr, and dplyr. In the future, I'd like to switch to readr.)

comment:9 Changed 3 years ago by iwakeh

Ok, so it seems there is convergence around the tidyverse style guide. As the style guide linked in comment:5 is hard to copy or download (at least I couldn't find the repo or link) and lintr is configurable and its documentation points to a different resource by the same author (Hadley Wickham) I'd suggest to use Wickham's guide, which is under cc license, in the same way we used Google's for Java, i.e., copy it to a wiki page, state additional points or style's where we diverge, and also provide a lintr collection of linters that enforce our style choices.

Next steps:

  • add R style guide to wiki
  • define additions and diversions (ongoing process)
  • provide an Ant task for apllying lintr in the way we apply checkstyle checks

What is missing?

Last edited 3 years ago by iwakeh (previous) (diff)

comment:10 Changed 3 years ago by karsten

Sounds like a fine plan!

comment:11 Changed 8 months ago by karsten

Actual Points: 0.1
Resolution: fixed
Status: newclosed

I simplified this approach a bit by first re-running lintr and resolving a couple remaining style issues and then leaving a comment at the start of our single R source file with instructions for re-doing this after making non-trivial changes. This should be sufficient for us. Closing.

Note: See TracTickets for help on using tickets.