OpenFloor #4: An Example for +/- Extension

Also some other stuff

Jul 13, 2025

Off-season in Europe is relatively calm compared to the NBA. On top of that, I’ve been focusing on extending my Bayesian skills, so I haven’t been working with EuroLeague data recently. So I decided to follow up on my recent post.

One of my favorite extensions of +/- is for eFG%: which players help others shoot better when they share the floor (even when they’re not the passer)? As I pointed out in the previous post, some players help others because they have tremendous gravity: they attract the attention of the defense. Some pass well, some stretch the floor, and some combine all of those. Unfortunately, the box score doesn’t capture each of these: there isn’t a metric for “great floor stretching”.

The data matrix changes a little bit for this extension. In classic RAPM, there are two ways to define the rows of the data matrix:

Each row corresponds to a possession.
Each row corresponds to a stint (time periods within a match with no substitutions).

In this version, each row will correspond to a single shot. On top of that, instead of having two columns for each player, we now need three: one for defense, one for offense, and one for when he’s the shooter. So the offense column gets split into two here. Just to remind you, these columns are on/off switches. If a player is on the court and on offense, his offensive column becomes 1, and the same logic applies to the other columns. Anyway, this is a nice extension, and it is possible to carry this “split the column into two” approach over to other things as well. For example, you can split each player’s column in a RAPM-type data matrix into two: one for when he is on the court with some high-usage teammate and one for when he isn’t.
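Here is a minimal sketch of how such a shot-level matrix could be built. The shot log, the player labels, the `build_design_matrix` helper, and the three-a-side lineups are all made up for illustration; real rows would have five players per side. I am also assuming that the shooter's regular offense column stays at 0 on his own shots, and that the response is the per-shot eFG value (1.5 for a made three, 1.0 for a made two, 0 for a miss), so that the average of the response equals eFG%.

```python
# Hypothetical shot log: one entry per field-goal attempt.
shots = [
    {"shooter": "A", "offense": ["A", "B", "C"], "defense": ["X", "Y", "Z"],
     "made": True, "is_three": True},
    {"shooter": "X", "offense": ["X", "Y", "Z"], "defense": ["A", "B", "C"],
     "made": False, "is_three": False},
    {"shooter": "B", "offense": ["A", "B", "C"], "defense": ["X", "Y", "Z"],
     "made": True, "is_three": False},
]


def build_design_matrix(shots):
    """Return (column names, 0/1 rows, per-shot eFG values)."""
    players = sorted({p for s in shots for p in s["offense"] + s["defense"]})
    # Three on/off columns per player: offense (not shooting), shooter, defense.
    columns = [f"{p}_{role}" for p in players for role in ("off", "shooter", "def")]
    index = {name: i for i, name in enumerate(columns)}

    X, y = [], []
    for s in shots:
        row = [0] * len(columns)
        for p in s["offense"]:
            # Assumption: the shooter switches on only his shooter column.
            role = "shooter" if p == s["shooter"] else "off"
            row[index[f"{p}_{role}"]] = 1
        for p in s["defense"]:
            row[index[f"{p}_def"]] = 1
        X.append(row)
        # Per-shot eFG value: a made three counts 1.5, a made two 1.0, a miss 0.
        y.append((1.5 if s["is_three"] else 1.0) if s["made"] else 0.0)
    return columns, X, y


columns, X, y = build_design_matrix(shots)
print(len(columns), "columns;", "sample eFG% =", round(sum(y) / len(y), 3))
```

From here, `X` and `y` could go into a regularized regression the same way a classic RAPM matrix would. The coefficients on the `_off` columns are the ones that would answer the "who helps others shoot better" question.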

I did a messy version of what I described above with a single season of data, and the top players don’t surprise me: Tarık Biberovic, Niccolo Mannion, Rodrigue Beaubois, Marko Guduric, Facundo Campazzo, Sasha Vezenkov, Nikola Mirotic… This list passes the smell test. What do I mean by that? Let’s say you built a model with NBA data to estimate impact. If your model does not put Jokic and SGA at the top, your model is probably off. In football this is called the Messi test, in basketball the LeBron test, in baseball the Trout test (baseball lovers, correct me if I’m wrong). So, when I specify a model like the one above, I expect players with gravity or great passing ability to be among the top. Maybe not in the order that I have in mind, but I expect to see them when I eyeball the top.

As another example, you can use the same structure for turnovers. Then you should expect to see high-usage players at the top: a player needs the ball to turn it over, and a guy like Mike James or Kevin Punter has the ball most of the time. If you’re playing next to them, you don’t lose the ball that often since you don’t have it that often.

Anyway, I hope I conveyed the gist.

I took a glance at how much the bonus possessions in EuroLeague move the needle for the teams. Here’s the plot (shoutout to F5, I learned these from his tutorials back in the day).

I was actually expecting every team to show a higher offensive rating in the bonus, but that’s certainly not the case. I wonder if the teams that do worse actually draw fewer fouls during the bonus. I’ll let you know if I check it in the future.
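For anyone who wants to try a similar comparison, here is a rough sketch of the calculation, not the exact code behind the plot. The possession log, the team names, the `in_bonus` flag, and the `offensive_rating_by_bonus` helper are all hypothetical. Offensive rating here is simply points per 100 possessions.

```python
from collections import defaultdict

# Hypothetical possession log: points scored and whether the offense was in the bonus.
possessions = [
    {"team": "Team A", "in_bonus": True, "points": 2},
    {"team": "Team A", "in_bonus": True, "points": 0},
    {"team": "Team A", "in_bonus": False, "points": 3},
    {"team": "Team A", "in_bonus": False, "points": 0},
    {"team": "Team A", "in_bonus": False, "points": 2},
    {"team": "Team B", "in_bonus": True, "points": 2},
    {"team": "Team B", "in_bonus": True, "points": 1},
    {"team": "Team B", "in_bonus": False, "points": 0},
    {"team": "Team B", "in_bonus": False, "points": 2},
]


def offensive_rating_by_bonus(possessions):
    """Points per 100 possessions for each (team, in_bonus) pair."""
    totals = defaultdict(lambda: [0, 0])  # key -> [points, possession count]
    for p in possessions:
        key = (p["team"], p["in_bonus"])
        totals[key][0] += p["points"]
        totals[key][1] += 1
    return {key: 100.0 * pts / n for key, (pts, n) in totals.items()}


ratings = offensive_rating_by_bonus(possessions)
for team in sorted({p["team"] for p in possessions}):
    diff = ratings[(team, True)] - ratings[(team, False)]
    print(f"{team}: bonus minus non-bonus offensive rating = {diff:+.1f}")
```

A positive difference would mean the team scores more efficiently in the bonus. With real data you would also want to check how many bonus possessions each team has before reading much into the gap.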

If you’re interested in data science and statistics outside of basketball/sports analytics, you can keep an eye on my website as well.

The Read Step
