
I use a simple device to keep these straight:

- (l)apply: List apply always returns a list

- (s)apply: simplify apply tries to return simplified result

- (v)apply: verify apply checks that each return value conforms to a user-supplied template

- (m)apply: multiple apply applies FUN to multiple vectors

- (r)apply: recursive apply walks a nested list and applies FUN to each leaf; with the default how = "unlist" it is essentially a flatmap

- apply : no device here, only use on matrices, never data.frames
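A minimal sketch of the list above, one call per function. The inputs (x, the nested list, the 2 x 3 matrix) are made up purely for illustration:

```r
# Hypothetical inputs, chosen only for illustration
x <- list(a = 1:3, b = 4:6)

# lapply: always returns a list, one element per input element
l_res <- lapply(x, sum)

# sapply: same loop, then simplifies to a named vector when it can
s_res <- sapply(x, sum)

# vapply: like sapply, but errors unless every result matches the template
v_res <- vapply(x, sum, integer(1))

# mapply: applies FUN across several vectors in parallel
m_res <- mapply(function(p, q) p + q, 1:3, 4:6)

# rapply: applies FUN to every leaf of a nested list; how = "unlist" flattens
r_res <- rapply(list(1, list(2, 3)), function(v) v * 10)

# apply: works over the rows (1) or columns (2) of a matrix
a_res <- apply(matrix(1:6, nrow = 2), 1, sum)
```

Here s_res and v_res hold the same values; the only difference is that vapply would stop with an error if sum ever returned something other than a single integer.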




After reading your explanation I still, as has been the case for years, don't understand what sapply or rapply does, and vapply sounds weird.

That isn't going to change, because I'm just not going to use them or 'invest' the time in finding out what some statistician-of-yore's interpretation of a map is. Instead I'll stick to tidyverse map - returns a list. Or tidyverse map_[int/chr/dbl/etc, etc] if I want a vector of [int/chr/dbl/etc, etc].

That and the data frame manipulation verbs cover the most useful 80% of cases where *apply would otherwise be needed. If the base R team were implementing functions that way in the base, stats professors wouldn't need to complain about mass exoduses from base R.


Yeah, that's a valid point. And it took me a couple years to internalize these differences -- mostly by being burned on numerous occasions. Base R has a lot of idiosyncrasies and they have to be memorized, unfortunately.

I develop R packages for my colleagues and so I stick to Base R whenever possible. I don't want my packages depending on the tidyverse at all. But for EDA, I am agnostic about what my colleagues do. They should use the tools that stay out of the way and let them get their hands around the dataset intuitively. For me that's Base R, for others it's data.table or tidyverse.


> After reading your explanation I still, as has been the case for years, don't understand what sapply or rapply does,

There is nothing complicated about what sapply does. It simply means: loop over the elements of the first argument (a list or vector; btw a data frame, df, is internally a list of columns) and apply some function to each one. lapply does this and returns a list; sapply does the same and then tries to "simplify" the result into a vector or matrix when it can.

So:

lapply(df, class) = loop over the elements of df and tell me the class, return this in the form of a list

sapply(df, class) = loop over the elements of df and tell me the class, return this as a character vector

This is basically lapply:

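  # assumes df is an existing data frame (or any list)
  res = vector("list", length(df))
  names(res) = names(df)
  for(i in seq_along(df)){
    # df[[i]] extracts the column itself; df[i] would be a one-column data.frame
    res[[i]] = class(df[[i]])
  }
  res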
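
To see the lapply/sapply difference on a concrete case, and one way sapply's simplification can bite, here is a small sketch. The data frame and its column names are hypothetical:

```r
# Hypothetical data frame, used only for illustration
df <- data.frame(id = 1:2, name = c("a", "b"), stringsAsFactors = FALSE)

as_list <- lapply(df, class)   # a list with one class string per column
as_vec  <- sapply(df, class)   # the same information as a named character vector

# A date-time column has two classes, c("POSIXct", "POSIXt"), so the results
# no longer share a length and sapply quietly returns a list instead.
df$when <- as.POSIXct(c("2024-01-01", "2024-01-02"), tz = "UTC")
surprise <- sapply(df, class)

# vapply states the expected shape up front and fails loudly instead.
strict <- tryCatch(
  vapply(df, class, character(1)),
  error = function(e) "error"
)
```

After adding the when column, surprise is a list rather than a character vector, while the vapply call raises an error (caught here and replaced by the string "error"). That silent change of return type is the usual reason to prefer vapply in package code.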


Love it or hate it, it's pretty clear these fns had very little api design thought go into them.

As much as I think people can lean too much on the concept, single responsibility principle would've gone a long way here.



