Functions over Idioms – Writing R in Python with rfuns
R-bloggers 2026-05-22
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
If you’ve read any of my past posts you know I like to program in several different languages, some of which I like more than others. Sometimes a problem calls for a particular language to be used, and with that comes adjusting one’s brain to thinking in that language and using the appropriate idioms to leverage that language’s features. But what if I don’t want to?

The line between R and Python has been heavily blurred the last few years, particularly with {reticulate} enabling us to use Python within R code, RStudio rebranding as Posit and taking on a strong Python development effort, releasing Positron as a multi-language IDE, and Quarto being a multi-language rethink of Rmarkdown.
I occasionally need to use Python directly – an SDK wrapping an API exists and I don’t particularly want to spend a lot of time writing my own R version, especially before I know what I want to get out of the endpoints. At this point I tend to bump up against my muscle-memory from R and try to use functions I’m familiar with from R, but which don’t actually exist in Python. Now, that might sometimes be because the pattern I’m trying to encode simply has a different name in Python; instead of an sapply(x, f)

sapply(c(2, 3, 4, 5), \(x) x ^ 2)
## [1]  4  9 16 25

I should reach for map, in which case I am reminded that this produces a lazy iterator that doesn’t show me the results

map(lambda x: x ** 2, [2, 3, 4, 5])
## <map object at 0x10d7fbee0>
and so I need to wrap it into a list to get the values out
list(map(lambda x: x ** 2, [2, 3, 4, 5]))
## [4, 9, 16, 25]
Or, I could use a list comprehension which isn’t lazy
[v ** 2 for v in [2, 3, 4, 5]]
## [4, 9, 16, 25]
That’s the idiom that I should be reaching for. Sure.
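As a small aside on that laziness, here is a sketch of one practical consequence: a map object can only be walked once, so a second pass over it yields nothing. The variable names are mine and only there for illustration.

```python
# A map object is a one-shot iterator: once consumed, it is empty.
squares = map(lambda x: x ** 2, [2, 3, 4, 5])

first_pass = list(squares)   # consumes the iterator
second_pass = list(squares)  # nothing left to yield

print(first_pass)   # [4, 9, 16, 25]
print(second_pass)  # []
```

A list comprehension builds the whole list up front, so it can be reused as often as needed.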
Other times there’s a package I need to use and a slightly different way of approaching the problem. In R I love the table() function for getting histogram-like counts of the unique values of a vector

table(c("b", "a", "c", "a", "b", "a"))
## 
## a b c 
## 3 2 1

which in Python looks like

from collections import Counter
sorted(Counter(["b", "a", "c", "a", "b", "a"]).items())
## [('a', 3), ('b', 2), ('c', 1)]

Probably Pythonistas remember that idiom and the package to import and the .items() extractor and the fact that they maybe want to sort the result. But I kept coming back to a question I ask myself: what if I don’t want to? Why is there not a function that wraps this idiom? If there was, why not just call it “table”? Admittedly, it’s far from the catchiest, most memorable, or most useful name, but it’s immediately recognisable to an R user (ditto for “sapply”).
One approach I considered here was to just call R from Python. That can be done, but I doubt I or anyone else wants to deal with that every time we want to iterate over a list. There’s a package on the Python package index which seems to support this nicely: https://pypi.org/project/r-functions/ but it’s wrappers around individual R files, via Rscript. I’m thinking more along the lines of ‘native Python with an R interface’.
Python is an object-oriented language, but it has functions, so why not make one
from collections import Counter

def table(x):
    return dict(sorted(Counter(x).items()))

table(["b", "a", "c", "a", "b", "a"])
## {'a': 3, 'b': 2, 'c': 1}

def sapply(x, func):
    return [func(v) for v in x]

sapply([2, 3, 4, 5], lambda x: x ** 2)
## [4, 9, 16, 25]

and have a nicer function interface to apply these idioms? I thought about this a bit longer, and realised there’s lots of functions I use in R that I wish I could use in Python. An idiom for finding the index of elements of a ‘vector’ (list in Python) which are true (TRUE in R, True in Python) is
[i for i, v in enumerate(x) if v]
but I just want to call which(x)
which(c(FALSE, FALSE, TRUE, FALSE, TRUE))
## [1] 3 5
so why not define this
def which(x):
    return [i for i, v in enumerate(x) if v]

which([False, False, True, False, True])
## [2, 4]
(remembering that Python is 0-indexed).
How far could one take this? Quite a long way!
I thought more about what differences would need to be accounted for, and one that immediately came to mind was that R is vectorised. If I was to recreate R’s character counting function nchar(s) as essentially len(s), I’d need to consider whether I wanted it to work on a single string or a ‘vector’ of strings.
In R:

nchar(c("these", "all", "have", "different", "lengths"))
## [1] 5 3 4 9 7

But in Python, len() expects a single value, so it calculates the length of the list

len(["these", "all", "have", "different", "lengths"])
## 5
The ‘proper’ way to do it is to map over the list
[len(s) for s in ["these", "all", "have", "different", "lengths"]]
## [5, 3, 4, 9, 7]

but again, why do I need to use an idiom for this? What if I just made a decorator to change a regular function to a vectorised one by applying this list comprehension internally when it’s passed a list (or a tuple), and which otherwise just evaluates the function with the argument?

import functools

def make_vec(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if isinstance(args[0], (list, tuple)):
            return [func(xi, *args[1:], **kwargs) for xi in args[0]]
        return func(*args, **kwargs)
    return wrapper

@make_vec
def my_len(s):
    return len(s)

my_len(["these", "all", "have", "different", "lengths"])
## [5, 3, 4, 9, 7]
and I could name it… “nchar”!
The other use-case that came to mind was Elio venting (and referencing a post to which I also wrote a sort of response) that they needed to list the files in the current directory
with the idiom
import os
[os.path.join(path, f) for f in os.listdir(path)]
The supplied suggestions included
from pathlib import Path
list(Path(path).iterdir())

(just rolls off the tongue, doesn’t it?) which returns a list of PosixPath() objects and is hardly easy to parse visually.
So, why not have a function?!?
import os

def list_files(path):
    return [os.path.join(path, f) for f in os.listdir(path)]

path = "path/to/files"
list_files(path)
## ['path/to/files/file1.txt', 'path/to/files/file2.txt', 'path/to/files/file3.csv']
I would have liked to call this list.files() but, since Python reserves the dot for attribute access and method calls, it can’t be that.
This then raises the question of “should I support the arguments already in the R functions?” In this case, should it support a recursive argument? Yes, that adds complexity, but it’s surely do-able. At this point I reached for some AI assistance and had Claude help me to implement as many functions as we could think of, supporting as many common arguments as possible. This involved extending the decorator to support vectorising other arguments (which also need to be careful about dots).
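To give a feel for what a recursive argument adds, here is a minimal sketch using os.walk. The name list_files_sketch and its exact behaviour are assumptions made for illustration, not the code that ended up in the package; it returns sorted results and, when recursing, files only, which is how R's list.files() behaves by default.

```python
import os

def list_files_sketch(path=".", recursive=False):
    """Hypothetical sketch: full paths of entries under `path`.

    recursive=False: everything directly inside `path` (files and directories).
    recursive=True: walk the whole tree and return files only.
    """
    if not recursive:
        return sorted(os.path.join(path, f) for f in os.listdir(path))
    found = []
    for root, _dirs, files in os.walk(path):
        # os.walk yields each directory in turn along with the files in it
        found.extend(os.path.join(root, f) for f in files)
    return sorted(found)
```

Each further R argument (pattern, full_names, and so on) would be another branch of this kind, which is where the complexity creeps in.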
On testing it out, it looked like we had something viable.
One last piece I wanted to support, though: the which() example above extracts the elements of a logical vector which are True, but in order to build that vector in the first place, I would naturally leverage R’s vectorisation as an array language. The two steps involved here are to first compute the comparison resulting in a logical vector, then to use which() to identify the indices of those which are true

which(c("c", "b", "a", "c", "a", "b") == "a")
## [1] 3 5

The vectorisation decorator above doesn’t help here, because it’s at the point of == that we want to vectorise

['c', 'b', 'a', 'c', 'a', 'b'] == 'a'
## False
This is False because Python compares the list as a whole to the string 'a', rather than comparing element by element.
The appropriate idiom is once again a comprehension, here written as a generator expression passed directly to which()

which(x == 'a' for x in ['c', 'b', 'a', 'c', 'a', 'b'])
## [2, 4]
The solution I’m fond of is to create a new ‘Vec’ class which wraps binary operators with a list comprehension, again abstracting away this detail. This means implementing __eq__, __add__, __and__ and lots of other binary operations, but with that, and a wrapper to create such an object, the comparison operators can be vectorised

vals = vec(['c', 'b', 'a', 'c', 'a', 'b'])
which(vals == 'a')
## [2, 4]
Not pristine, but quite clean, if you ask me.
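For readers wondering what such a class might look like, below is a minimal sketch covering just the three operators named above. Everything in it (the _elementwise helper, the scalar recycling rule, the length check) is an assumption made for illustration and may well differ from the class in the package.

```python
import operator

class Vec:
    """Hypothetical minimal list wrapper with elementwise binary operators."""

    def __init__(self, values):
        self.values = list(values)

    def _elementwise(self, other, op):
        # Pair up with another Vec of equal length, or recycle a scalar.
        if isinstance(other, Vec):
            if len(other.values) != len(self.values):
                raise ValueError("lengths differ")
            pairs = zip(self.values, other.values)
        else:
            pairs = ((v, other) for v in self.values)
        return Vec(op(a, b) for a, b in pairs)

    # Note: defining __eq__ this way makes Vec instances unhashable.
    def __eq__(self, other):
        return self._elementwise(other, operator.eq)

    def __add__(self, other):
        return self._elementwise(other, operator.add)

    def __and__(self, other):
        return self._elementwise(other, operator.and_)

    def __iter__(self):
        return iter(self.values)

    def __repr__(self):
        return f"Vec({self.values!r})"


def vec(x):
    return Vec(x)


def which(x):
    return [i for i, v in enumerate(x) if v]


vals = vec(["c", "b", "a", "c", "a", "b"])
keep = vec([True, True, True, True, False, True])

print(which(vals == "a"))           # [2, 4]
print(which((vals == "a") & keep))  # [2]
print(vec([1, 2, 3]) + 1)           # Vec([2, 3, 4])
```

Because which() only needs something it can iterate over, giving the class an __iter__ method is enough for the two to work together.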
With all these pieces in place, adding implementations for common base R functions including most arguments and a way to vectorise lists, I wrapped everything up into a Python package (my first) to learn how to do it.
The workflow isn’t particularly painful, with my biggest complication being different versions of Python supporting different requirements in pyproject.toml, and so some GitHub Actions are failing because of that.
As part of building out the implementations I had Claude add tests for each of the functions with some expected values – if I do want to improve some of the idioms internally, I want to ensure I don’t change the values produced. That works for having any testing at all, but how can I be sure that I’m reproducing what I would get if I was working in R? One option was to just run all of the test functions by hand and confirm that the values look similar enough, accounting for list vs vector and 0 vs 1 indexing. Instead, Claude managed to write an adaptor for pytest which does the realignment of e.g. list_files to list.files (and similarly for arguments), realigns the indexing where needed, and runs all existing tests directly in R via rpy2 (skipping over some for which I don’t have tests yet). I’m disabling automated testing of this because I suspect it could get flaky dealing with both R and Python on GitHub Actions, but I can confirm that all the current tests pass.
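The realignment step can be pictured with a small pure-Python sketch. This is not the pytest adaptor from the package: the lookup tables and the helpers r_literal, r_call and to_one_based are hypothetical, and the step that would hand the resulting string to R through rpy2 is left out entirely. An explicit lookup table is used because not every underscore becomes a dot (seq_along and seq_len keep theirs in R).

```python
# Hypothetical lookup tables: only names that differ between the two sides.
R_NAMES = {
    "list_files": "list.files",
    "file_exists": "file.exists",
    "which_min": "which.min",
    "which_max": "which.max",
}
R_ARGS = {"ignore_case": "ignore.case", "na_rm": "na.rm", "full_names": "full.names"}


def r_literal(value):
    """Render a Python value as R source text."""
    if isinstance(value, bool):  # check bool before anything numeric
        return "TRUE" if value else "FALSE"
    if value is None:
        return "NULL"
    if isinstance(value, str):
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'
    if isinstance(value, (list, tuple)):
        return "c(" + ", ".join(r_literal(v) for v in value) + ")"
    return repr(value)  # ints and floats


def r_call(func_name, *args, **kwargs):
    """Build the R expression equivalent to a Python-side call."""
    name = R_NAMES.get(func_name, func_name)
    parts = [r_literal(a) for a in args]
    parts += [f"{R_ARGS.get(k, k)} = {r_literal(v)}" for k, v in kwargs.items()]
    return f"{name}({', '.join(parts)})"


def to_one_based(indices):
    """Shift 0-based Python indices to R's 1-based positions."""
    return [i + 1 for i in indices]


print(r_call("which", [False, False, True, False, True]))
# which(c(FALSE, FALSE, TRUE, FALSE, TRUE))
print(r_call("list_files", "data", full_names=True))
# list.files("data", full.names = TRUE)
print(to_one_based([2, 4]))
# [3, 5]
```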
I wanted to have a documentation website similar to what we have via {pkgdown} and came across quartodoc which is what the Python version of {pins} uses. Getting that to work required downgrading a specific Python dependency, but was otherwise painless.
I have a working package locally – how do I share it? This seemed like the perfect opportunity to learn what the release process looks like for Python. I have a handful of packages on CRAN and one on Bioconductor, and the process there is far from frictionless, with the side-effect that there’s some trust you can place on the interoperability of packages and minimal (automated) code checking. While Python is more ‘wild west’ in terms of what can be uploaded, it’s really nice to see that they do have an entirely separate test server where you can upload your package and see how it looks. I’m reminded of the quote
Everybody has a testing environment. Some people are lucky enough to have a totally separate environment to run production in.
Given that it’s not currently possible to run 100% of the CRAN checks locally (and even some that you can run give a different result from what’s on their systems), this does make me a little jealous. I wonder whether the decrease in load from rejecting failing submissions would offset supporting a test submission server.
All went well pushing to the test server (via an authentication key) and I managed to build up the courage to push to the production instance… it’s live!

and the documentation site isn’t too bad, either (in my opinion).
This means that you can now run
uv add rfuns
(or the equivalent in whatever virtual environment management configuration you’re using, e.g. pip install rfuns) and start using some R functions directly in Python!
Depending on how you like to manage your imports, you can import everything
from rfuns import *
which([False, False, True, False, True])
## [2, 4]
or, if you prefer to namespace
import rfuns as r
r.which([False, False, True, False, True])
## [2, 4]
The list of functions currently imported, grouped into sections, is:
Strings
- nchar(x)
- nzchar(x)
- paste(*args, sep=" ", collapse=None)
- paste0(*args, collapse=None)
- grepl(pattern, x, ignore_case=False, fixed=False)
- grep(pattern, x, ignore_case=False, fixed=False, value=False, invert=False)
- gsub(pattern, replacement, x, ignore_case=False, fixed=False)
- sub(pattern, replacement, x, ignore_case=False, fixed=False)
- trimws(x, which="both", whitespace=r"[ \t\r\n]")
- toupper(x)
- tolower(x)
- startsWith(x, prefix)
- endsWith(x, suffix)
- strsplit(x, split, fixed=False)
- substr(x, start, stop)
- chartr(old, new, x)
- formatC(x, digits=6, format="g", width=None)
Vectors
- which(x)
- which_min(x)
- which_max(x)
- diff(x, lag=1)
- cumsum(x)
- cumprod(x)
- cummax(x)
- cummin(x)
- rev(x)
- duplicated(x)
- setdiff(x, y)
- intersect(x, y)
- union(x, y)
- unique(x)
- seq_along(x)
- seq_len(n)
- seq(from_=0, to=None, by=None, length_out=None) (from is a reserved keyword)
- sign(x)
- r_range(x) (renamed to not conflict with range())
Math
- sign(x)
- trunc(x)
- ceiling(x)
- floor(x)
- sqrt(x)
- log(x, base=None)
- log2(x)
- log10(x)
- exp(x)
- abs(x)
- var(x, na_rm=False)
- sd(x, na_rm=False)
- mean(x, na_rm=False)
- median(x, na_rm=False)
- quantile(x, probs=None, na_rm=False)
- scale(x, center=True, scale_=True)
- round(x, digits=0)
Files
- list_files(path=".", pattern=None, all_files=False, full_names=False, recursive=False, ignore_case=False, include_dirs=False, no_dot=False)
- file_exists(path)
- dir_exists(path)
- basename(path)
- dirname(path)
- file_path(*args)
Table
- table(x)
- prop_table(x)
- margin_table(x)
Functional
- lapply(x, func)
- sapply(x, func)
- vapply(x, func, expected_type)
- tapply(x, index, func)
- rapply(x, func)
- Filter(func, x)
- Map(func, *args)
- Reduce(func, x, init=None, accumulate=False)
Inspect
- head(x, n=6)
- tail(x, n=6)
- length(x)
- nrow(x)
- ncol(x)
- dim(x)
- summary(x)
- rstr(x) (renamed to not conflict with str())
Utils
vec(x)
Some of these are vectorised
nchar(["these", "all", "have", "different", "lengths"])
## [5, 3, 4, 9, 7]
grepl("ar", ["frog", "carpet", "basket", "dart"])
## [False, True, False, True]
sqrt([36, 81, 9])
## [6.0, 9.0, 3.0]

while others (approximately, up to 0-indexing) preserve the R behaviour, such as how seq() works

seq(5)
## [0, 1, 2, 3, 4]
seq(from_=0, to=10, by=2)
## [0, 2, 4, 6, 8, 10]

(note that from is a keyword in Python, so the argument here is now from_) and set operations

setdiff([5, 2, 4, 1], [2, 1])
## [5, 4]
whereas this does not preserve order
set([5, 2, 4, 1]) - set([2, 1])
## {4, 5}
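One way the order-preserving behaviour could be achieved is sketched below. The name setdiff_sketch is hypothetical and this is not necessarily how the package does it; it assumes hashable elements and, like R's setdiff(), drops duplicates from the result.

```python
def setdiff_sketch(x, y):
    """Hypothetical sketch: elements of x not in y, in first-seen order, no duplicates."""
    exclude = set(y)  # fast membership checks
    seen = set()
    result = []
    for v in x:
        if v not in exclude and v not in seen:
            seen.add(v)
            result.append(v)
    return result


print(setdiff_sketch([5, 2, 4, 1], [2, 1]))  # [5, 4]
```

The sets are only used for lookups; the result is built by walking x in order, which is what keeps 5 ahead of 4.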
Doing all of this myself would have taken quite some time, so I’m grateful to be able to direct an agent towards accomplishing some of the tedious parts of this project. I still drove the decision making and made sure to verify outputs, so I don’t consider this a ‘vibe-coded’ project.
I’m not recommending you use this in production at all – I’ve taken whatever idiom I could find (or generate) for the internals of all of these, and haven’t paid any attention to their performance. The goal was to make it easier for me to work interactively in a REPL when I’m reaching for particular functions. That being said, I’ll gladly do my best to understand the Pythonic versions so that I can better appreciate native Python and use the idioms when my helper package isn’t available (or is unsuitable). I’d say it’s fair to argue that R users using Python should learn how to do things in a Pythonic way, but I also just want to get some small things done occasionally, so I’m happy this now exists.
If you’re working with non-R colleagues then introducing these abstractions, while they may make your life simpler in the moment, will probably result in confusion as you’re hiding away the implementation and giving it a name they won’t recognise. That’s precisely what functions are for (with helpful names), of course, but unless this package becomes popular, I’ll bet that the inline idioms are more welcomed in a codebase.
I’d love to hear what people think about this, although I’m entirely fine with me being the sole user of it. Should I just force my muscle-memory to take on the Python idioms? Am I going to be punished for ‘crossing the streams’ of two incompatible languages? Would this be helpful to you? Are there other considerations I’ve missed? As always, I can be found on Mastodon and the comment section below.
Shoutouts to Elio Campitelli and Michael Sumner for feedback on a draft of this post.
devtools::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.5.3 (2026-03-11)
##  os       macOS Tahoe 26.3.1
##  system   aarch64, darwin20
##  ui       X11
##  language (EN)
##  collate  en_US.UTF-8
##  ctype    en_US.UTF-8
##  tz       Australia/Adelaide
##  date     2026-05-22
##  pandoc   3.6.3 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/aarch64/ (via rmarkdown)
##  quarto   1.7.31 @ /usr/local/bin/quarto
## 
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package     * version date (UTC) lib source
##  blogdown      1.23    2026-01-18 [1] CRAN (R 4.5.2)
##  bookdown      0.46    2025-12-05 [1] CRAN (R 4.5.2)
##  bslib         0.10.0  2026-01-26 [1] CRAN (R 4.5.2)
##  cachem        1.1.0   2024-05-16 [1] CRAN (R 4.5.0)
##  cli           3.6.5   2025-04-23 [1] CRAN (R 4.5.0)
##  devtools      2.4.6   2025-10-03 [1] CRAN (R 4.5.0)
##  digest        0.6.39  2025-11-19 [1] CRAN (R 4.5.2)
##  ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.5.0)
##  evaluate      1.0.5   2025-08-27 [1] CRAN (R 4.5.0)
##  fastmap       1.2.0   2024-05-15 [1] CRAN (R 4.5.0)
##  fs            1.6.7   2026-03-06 [1] CRAN (R 4.5.2)
##  glue          1.8.1   2026-04-17 [1] CRAN (R 4.5.2)
##  htmltools     0.5.9   2025-12-04 [1] CRAN (R 4.5.2)
##  jquerylib     0.1.4   2021-04-26 [1] CRAN (R 4.5.0)
##  jsonlite      2.0.0   2025-03-27 [1] CRAN (R 4.5.0)
##  knitr         1.51    2025-12-20 [1] CRAN (R 4.5.2)
##  lattice       0.22-9  2026-02-09 [1] CRAN (R 4.5.3)
##  lifecycle     1.0.5   2026-01-08 [1] CRAN (R 4.5.2)
##  magrittr      2.0.4   2025-09-12 [1] CRAN (R 4.5.0)
##  Matrix        1.7-4   2025-08-28 [1] CRAN (R 4.5.3)
##  memoise       2.0.1   2021-11-26 [1] CRAN (R 4.5.0)
##  otel          0.2.0   2025-08-29 [1] CRAN (R 4.5.0)
##  pkgbuild      1.4.8   2025-05-26 [1] CRAN (R 4.5.0)
##  pkgload       1.5.0   2026-02-03 [1] CRAN (R 4.5.2)
##  png           0.1-9   2026-03-15 [1] CRAN (R 4.5.2)
##  purrr         1.2.2   2026-04-10 [1] CRAN (R 4.5.2)
##  R6            2.6.1   2025-02-15 [1] CRAN (R 4.5.0)
##  Rcpp          1.1.1   2026-01-10 [1] CRAN (R 4.5.2)
##  remotes       2.5.0   2024-03-17 [1] CRAN (R 4.5.0)
##  reticulate    1.45.0  2026-02-13 [1] CRAN (R 4.5.2)
##  rlang         1.1.7   2026-01-09 [1] CRAN (R 4.5.2)
##  rmarkdown     2.30    2025-09-28 [1] CRAN (R 4.5.0)
##  rstudioapi    0.18.0  2026-01-16 [1] CRAN (R 4.5.2)
##  sass          0.4.10  2025-04-11 [1] CRAN (R 4.5.0)
##  sessioninfo   1.2.3   2025-02-05 [1] CRAN (R 4.5.0)
##  usethis       3.2.1   2025-09-06 [1] CRAN (R 4.5.0)
##  vctrs         0.7.1   2026-01-23 [1] CRAN (R 4.5.2)
##  withr         3.0.2   2024-10-28 [1] CRAN (R 4.5.0)
##  xfun          0.56    2026-01-18 [1] CRAN (R 4.5.2)
##  yaml          2.3.12  2025-12-10 [1] CRAN (R 4.5.2)
## 
##  [1] /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/library
## 
## ─ Python configuration ───────────────────────────────────────────────────────
##  python:         /Users/jono/.cache/uv/archive-v0/Q1veGTfRq3GBaNYBXjagV/bin/python
##  libpython:      /Users/jono/.local/share/uv/python/cpython-3.12.12-macos-aarch64-none/lib/libpython3.12.dylib
##  pythonhome:     /Users/jono/.cache/uv/archive-v0/Q1veGTfRq3GBaNYBXjagV:/Users/jono/.cache/uv/archive-v0/Q1veGTfRq3GBaNYBXjagV
##  virtualenv:     /Users/jono/.cache/uv/archive-v0/Q1veGTfRq3GBaNYBXjagV/bin/activate_this.py
##  version:        3.12.12 (main, Oct 28 2025, 11:52:25) [Clang 20.1.4 ]
##  numpy:          /Users/jono/.cache/uv/archive-v0/Q1veGTfRq3GBaNYBXjagV/lib/python3.12/site-packages/numpy
##  numpy_version:  2.4.6
##  
##  NOTE: Python version was forced by VIRTUAL_ENV
## 
## ──────────────────────────────────────────────────────────────────────────────