One of the common questions we’re seeing is about package weights, and I thought it’d be worth starting a thread for ideas and feelings on this.
A reminder that the “weight” is one of the factors in our payment formula, which is summarized here: https://tidelift.com/docs/lifting/paying
The overall effect of the formula is that both small, popular packages and large, niche packages can earn money on Tidelift; large, popular packages of course earn the most money.
I’m feeling great about the “usage” part of the formula. This is the royalty-like aspect: if a project builds and maintains a package that subscribers actually use, it gets paid in proportion to its success. But if I just upload some code to npm and nobody but me uses it, I get nothing.
The “weight” part is trickier. I thought I’d give some behind-the-scenes on what we’re doing right now and why, and leave this thread open a while for feedback and ideas.
There are some downsides to weighting by code size, but on balance we think it’s better than equal-weighting - a 2-line function and a huge framework simply shouldn’t get the same portion of fees, because they don’t take the same amount of work or create the same impact. We are mitigating the downsides in these ways:
- The formula also incorporates usage; if you make a package that’s full of junky code, someone can make a forked version without the junk and take away your usage.
- We adjust the code size measurement to remove things like generated code, vendored code, boilerplate, and tests.
- We adjust the code size measurement to make different languages comparable.
- We will actively shut down any identifiable attempts to game this, and since gaming it hurts other maintainers in your ecosystem, we expect that gaming attempts are likely to be reported. Also, we think most OSS maintainers are ethical.
- We hope to incorporate some other measures into the weight over time.
Any way we split things up will be a little arbitrary and a little game-able.
We think “adjusted code size” has some virtues, such as being fairly objective and having some relationship to maintenance effort. It’s better in our minds than equal weighting for those reasons.
But better doesn’t mean perfect; we’re very open to suggestions on how we should evolve the weight factor or what else should go into it.
Rejected approach: equal weighting
The simplest approach to weight is equal-weighting (which is the same as “don’t have a weight factor, only consider usage”). The issues with this include:
- a very strong incentive to split up a package into many small packages
- the intuition that a huge framework and a 2-line function are not the same amount of maintenance effort or the same amount of value-to-subscribers
The issue with bad incentives isn’t only that people might game Tidelift on purpose. It’s also that in the wild, people have already sometimes split things up and sometimes haven’t, for technical or practical reasons, so equal weighting would reward or penalize those existing choices arbitrarily.
Rejected approach: size of the entire package
The next-simplest approach we came up with was “the size of the package” (like “make an HTTP HEAD request to the package’s download URL and get the Content-Length”). The issues with this include:
- some packages include N copies of their code, like a regular version, minified version, minified-a-different-way version, etc.
- some packages include various data files, test files, vendored code, autogenerated code, etc.
- the result is sensitive to the level of gzip or zip compression applied to the package
- the size includes package manager metadata, so there is still a slight gain from splitting up packages
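For reference, the HEAD-request measurement described above could be sketched roughly like this in Python. This is only an illustration, not the code we ran: the function name, the `opener` parameter, and the example URL are made up for this post.

```python
import urllib.request


def package_size_via_head(url, opener=urllib.request.urlopen):
    """Return the Content-Length the server reports for `url`, or None.

    `opener` is a hypothetical hook so the function can be exercised
    without network access; by default it is urllib's own urlopen.
    """
    request = urllib.request.Request(url, method="HEAD")
    with opener(request) as response:
        length = response.headers.get("Content-Length")
    # Some servers omit Content-Length on HEAD responses.
    return int(length) if length is not None else None


# Example call (hypothetical URL, not a real package):
# package_size_via_head("https://registry.example.invalid/some-package-1.0.0.tgz")
```

Note that the number this returns is the size of the compressed archive as served, which is exactly why compression level, bundled fixtures, and duplicate builds all skew it.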
I actually tried this since it seemed like the simplest thing that could possibly work, and the results were not good. For example, I was surprised how many packages ship their unit tests, sometimes including massive fixtures, right in the released package!
Current approach: adjusted code size
Getting a little more complex is an “adjusted code size.” What this means is that we unpack the package (removing compression), filter out files that aren’t code, filter out various kinds of code that shouldn’t count (like vendored dependencies), and then add up the sizes of the remaining code files.
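To make the unpack, filter, and sum steps concrete, here is a minimal Python sketch. It is not our actual implementation: the extension list, the excluded directory names, and the minified-file rule are all placeholder assumptions, and it skips harder cases such as generated code, boilerplate, and cross-language normalization.

```python
import io
import tarfile
from pathlib import PurePosixPath

# Hypothetical filter rules, chosen only for illustration.
CODE_EXTENSIONS = {".js", ".py", ".rb", ".java"}
EXCLUDED_DIRS = {"vendor", "node_modules", "test", "tests", "fixtures"}


def is_counted(path):
    """Decide whether a file inside the package counts toward the weight."""
    p = PurePosixPath(path)
    if p.suffix not in CODE_EXTENSIONS:
        return False  # not code (docs, data files, metadata)
    if p.name.endswith(".min.js"):
        return False  # minified duplicate of code already counted
    # Skip anything that lives under a vendored or test directory.
    return not any(part in EXCLUDED_DIRS for part in p.parts[:-1])


def adjusted_code_size(archive_bytes):
    """Sum the uncompressed sizes of the counted files in a tar archive."""
    total = 0
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:*") as archive:
        for member in archive:
            if member.isfile() and is_counted(member.name):
                total += member.size  # size after decompression
    return total
```

Because the sum uses uncompressed file sizes and ignores non-code files, the compression level and the package manager metadata no longer affect the result.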
This is what we’re doing now and the results feel pretty good; the packages at the top of the weightings are substantive packages with a lot of maintenance work going into them. Splitting up packages into smaller packages ought to have no effect, since the total weight would remain the same.
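Here is a small worked example of the splitting point, again as a sketch. It assumes a simplified model in which each package’s share of the pool is proportional to weight times usage (the real formula is the one in the docs linked above), and it assumes every piece of a split package keeps the same usage as the original. The package names and numbers are invented.

```python
from fractions import Fraction


def shares(packages):
    """packages maps name -> (weight, usage); returns name -> share of pool.

    Simplifying assumption: share is proportional to weight * usage.
    """
    scores = {name: weight * usage for name, (weight, usage) in packages.items()}
    total = sum(scores.values())
    return {name: Fraction(score, total) for name, score in scores.items()}


def split(packages, name, parts):
    """Replace `name` with `parts` equal pieces that keep the same usage."""
    weight, usage = packages[name]
    result = {k: v for k, v in packages.items() if k != name}
    for i in range(parts):
        result[f"{name}-{i}"] = (Fraction(weight, parts), usage)
    return result


def equal_weighted(packages):
    """Same packages, but every weight forced to 1 (usage only)."""
    return {name: (1, usage) for name, (_, usage) in packages.items()}


def combined_share(result, prefix):
    """Total share of every package whose name starts with `prefix`."""
    return sum(share for name, share in result.items() if name.startswith(prefix))


before = {"framework": (9000, 10), "helper": (1000, 10)}
after = split(before, "framework", 3)

size_before = combined_share(shares(before), "framework")                  # 9/10
size_after = combined_share(shares(after), "framework")                    # 9/10
equal_before = combined_share(shares(equal_weighted(before)), "framework") # 1/2
equal_after = combined_share(shares(equal_weighted(after)), "framework")   # 3/4
```

Under size weighting the framework’s combined share stays at 9/10 whether or not it is split, while under equal weighting the same split moves it from 1/2 to 3/4.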
We also do some normalization by ecosystem to make npm and Java more comparable (since many subscribers are using multiple ecosystems).
Conceptually, the weight indicates how much relative value each package provides to a single subscriber. Usage then considers how many subscribers are receiving that value.
We could incorporate some signal from subscribers of “how much I care about this package” into the weight number. I don’t think it’s a good idea to let subscribers completely pick and choose which packages get what, because if they get no value from something, why are they using it? We want to lift all boats. Also, no subscriber wants to micromanage weights on 3000 packages.
But perhaps there are ways for them to say “I really really care about package xyz” and factor that in, and we’ve heard the desire to do so from them.
I tend to think we should avoid anything in the weight that’s redundant with usage, such as download counts, GitHub stars, or other popularity measures. I’d expect these to correlate with usage, so pulling them in might double-count the same factor.
If people do start to game things, it could work out a lot like Google’s search algorithm or spam filtering algorithms, where we keep having to adapt. However, the absolute numbers of “packages people actually use” are a lot lower than the number of web pages or spam emails on the Internet, so manual-intervention solutions are more practical. We also have a business relationship and contract with all lifters, which helps.
By the way: if we do change the weighting algorithm in the future, mitigating the impact on lifters will be an important consideration. There are several ways to do that so we don’t pull the rug out from under anyone.
Feelings and ideas welcome
We can definitely make changes and evolve things from here.