Package weights

The simplest idea I’ve come up with is to ask subscribers: “please tell us which dependencies are most critical to you” and allow subscribers to put a check by some of their deps. I think we’d want to do some user testing on this to see if it makes sense to anyone.

I’d love to have some more automated way to get at this without someone having to do work - but it’s tough to think of what it’d be.

I would have chosen a type-weighting approach.

You would have different tiers:

  1. programming language / compiler
  2. framework
  3. library
  4. polyfill
  5. helper function

I mainly provide tiers 3, 4 and 5, and would completely understand if the maintainers in the top tiers (1 and 2) get more.

Interesting! That’s a new idea. Can think of some challenging aspects of it. But there’s intuitive appeal that the things at the higher tiers feel more “central” so this could be a useful factor.

I’m not sure tiers will add much to the scheme you already have. A programming language should only get more funding if it is widely used, and you’ve already got the breadth of use as a factor. Should a little-used esoteric programming language get funding just because it’s a programming language?

I think the number of users and the size of the project will already pay programming languages well without adding an extra factor based on the kind of software it is.

I am talking about the ratio of effort to remuneration here: i.e. whether the programming language is popular doesn’t matter to me in that regard. There are frameworks built on a language that make more money than the language’s creator; how is that fair?

If the value is calculated as “size of code” * “number of users”, then a language should make more than any framework it’s based on, no?

Or perhaps the thing to acknowledge here is that FrameworkX uses LanguageX, so some of the calculated value for FrameworkX should be re-directed to LanguageX?
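To make the arithmetic behind that question concrete, here is a small sketch of the “size of code” * “number of users” calculation. All the figures are invented for illustration; they are not real data about any language or framework.

```python
# Hypothetical illustration of value = lines_of_code * number_of_users.
# All numbers are made up for the example.
packages = {
    "LanguageX":  {"lines": 500_000, "users": 10_000},
    "FrameworkX": {"lines": 100_000, "users": 8_000},
}

# Raw value score for each package.
values = {name: p["lines"] * p["users"] for name, p in packages.items()}

# Each package's share of the combined value.
total = sum(values.values())
shares = {name: v / total for name, v in values.items()}

for name in packages:
    print(f"{name}: value={values[name]:,} share={shares[name]:.1%}")
```

Under these made-up numbers the language does out-earn the framework built on it, which is the intuition the post is appealing to; the open question is whether real usage data would look like this.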

So untested code has the same “weight” as code with tests (that have to be written and maintained)?

@mbrookes It seems like you’re drawing a conclusion from something someone here has said, but I’m not sure from what.

Remember that usage is also a factor (and hopefully, over time, a referendum on quality and utility). My hope/belief is that many activities such as documentation, testing, patch review, etc. are reflected in increased usage.

A practical reason to exclude tests is that we’re weighing the shipped package, not the source repo. (Many repos publish multiple packages.)

Most packages don’t ship the tests; a surprising number I’ve looked at do, but it kinda seems like they do it accidentally (the consumer of the package afaik doesn’t import or use the tests).

So that’s why I’ve currently adjusted the tests out of those packages that seem to accidentally ship them, so weights are comparable between packages that do and don’t bundle their tests.
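A rough sketch of what that adjustment could look like. The file patterns below are hypothetical examples of my own, not Tidelift’s actual heuristics:

```python
from pathlib import PurePosixPath

# Hypothetical directory names and filename patterns for test files that
# some packages accidentally publish in their archives.
TEST_DIR_NAMES = {"test", "tests", "__tests__", "spec"}

def is_test_file(path: str) -> bool:
    """Guess whether a file inside a published package archive is a test."""
    parts = PurePosixPath(path).parts
    name = parts[-1]
    return (
        any(p in TEST_DIR_NAMES for p in parts[:-1])
        or name.startswith("test_")
        or name.endswith((".test.js", ".spec.js"))
    )

def weighable_files(archive_paths):
    """Keep only the files that should count toward the package's weight."""
    return [p for p in archive_paths if not is_test_file(p)]

files = ["lib/index.js", "lib/util.js", "test/index.test.js", "src/foo.spec.js"]
print(weighable_files(files))
```

The point of filtering like this is comparability: two packages with identical production code get the same weight whether or not one of them happens to bundle its test suite.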

so some of the calculated value for FrameworkX should be re-directed to LanguageX?

It should if it’s accepting donations, yes.

Package weight is a subjective measure that depends on the value of the package in a specific project. When people can see that value, this personal measure should take priority, in my opinion. Every person on a project may have a package they like and want to support, and people should be given that ability. This makes Tidelift a more social story.

This almost fits the Future ideas section if “micromanagement” is replaced with “let people play”. The weights could be your personal preferences, but if a company hires you, they import your settings and may or may not distribute their weights accordingly. This makes you a valuable artifact in your community, regardless of whether you can participate or are busy with a new job.

Speaking of gameplay: in agile development people play with story points, so let them play with infrastructure points to estimate impact.

The subjective feeling of a project’s well-being is important as well. If one project has sufficient funds, then another might be seen as more important, but that’s a source of speculation, so unless people know each other personally, it may not work out well.

Something I’d emphasize as background that relates to some of this discussion, is that we are paying lifters to do work and to be on the hook for certain things; Tidelift is not a donation system, it’s a system for maintainers to collaboratively provide valuable benefits to subscribers. (See also Product roadmap snapshot as of January 2019.)

The way we frame it currently is that subscribers are covered for all packages they report that they use, so if they’re reporting it, they are getting the subscription benefits.

A related point is that subscription benefits and paying lifters are linked. So for example we don’t have a way to sign up to lift C packages right now, but we also don’t have a way for subscribers to get subscription benefits on C packages. In the current model, we’d want to add both of those at once.

This would be great, for both subscribers and lifters.

As a subscriber, I think I’d like something very simple like:

Pick 3 packages that, for any good reason, deserve an extra payout.

Let’s say 5% of Tidelift’s total subscriptions earnings (after Tidelift cut) every month will be paid out to the most highly favored packages in the open source ecosystem.

A subscriber’s 3 picks are essentially votes that decide the cut a given package gets from this bonus-payout, if any (there would obviously need to be a minimum). There could also be a cap on the max payout, so that even if a certain package got 46% of all votes, it can at most get a 10% piece of the bonus-payout.

Votes should be transparent ahead of payout. Not who (the companies) voted on what, but the number of votes different packages have.

As a lifter, this gives me an incentive to optimize for value in my package(s), as well as advertising my presence on Tidelift.
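A sketch of how that capped bonus pool could be divided. The minimum-vote threshold, the 10% cap, and the choice to leave any capped surplus in the pool are all my own reading of the proposal, not anything Tidelift has committed to:

```python
from collections import Counter

def bonus_shares(votes, pool, cap=0.10, minimum_votes=2):
    """Split a bonus pool proportionally to votes, with a per-package cap.

    Hypothetical rules: packages below `minimum_votes` get nothing, and no
    package may take more than `cap` of the pool; the capped surplus is
    simply left in the pool here (one of several possible choices).
    """
    counts = Counter(votes)
    eligible = {p: n for p, n in counts.items() if n >= minimum_votes}
    total = sum(eligible.values())
    if total == 0:
        return {}
    return {p: min(n / total, cap) * pool for p, n in eligible.items()}

# Each subscriber picks 3 packages; flatten all picks into one vote list.
# One dominant package (46% of votes) plus six mid-sized ones (9% each).
votes = ["left-pad"] * 46
for i in range(6):
    votes += [f"pkg{i}"] * 9

result = bonus_shares(votes, pool=10_000)
print(result)
```

With these numbers the dominant package hits the 10% cap (1,000 from a 10,000 pool) even though it drew 46% of the votes, while each 9% package gets its proportional 900, matching the capping behavior described above.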

I wonder how many subscribers are genuinely aware of how important all their dependencies are? Lots of large code bases include components that the subscriber never actually sees, but that are important and need supporting just as much as components they interface with directly.

It’d probably also be good for maintainers to weight their own dependencies versus their third party code, on a given package - on some packages, my transitive deps don’t really matter; on others, they’re the bulk of the work and should get the bulk of the funds.

I’m not sure that we should reward a package based on its size, for a few reasons:

  • It does not reflect how useful the package is
  • I could just avoid minifying my code

About this last point, how do you handle it? Would it make any difference?

I minify all my projects with Webpack+Babel to release a package with three versions:

  • A “browser”: minified bundled with all dependencies and polyfills for the browser [umd]
  • A “module”: minified non-bundled with the few polyfills that are missing for node >= 12 [esm]
  • A “main”: minified non-bundled with the polyfills that are missing for node >= 8 [cjs]
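One plausible way a weighing tool could detect and skip minified builds like these, so that deliberately not minifying gains nothing. This heuristic is purely my own sketch; I don’t know how Tidelift actually handles it:

```python
def looks_minified(source: str, long_line: int = 500) -> bool:
    """Crude heuristic: minified JS tends to pack code into very long lines."""
    lines = source.splitlines()
    if not lines:
        return False
    avg = sum(len(line) for line in lines) / len(lines)
    longest = max(len(line) for line in lines)
    return avg > long_line or longest > 5 * long_line

readable = "function add(a, b) {\n  return a + b;\n}\n"
minified = "function add(a,b){return a+b}" * 100  # one giant line
print(looks_minified(readable), looks_minified(minified))
```

If only unminified builds count toward weight, then skipping the minifier doesn’t inflate a package’s share, which answers the “I could just avoid minifying my code” concern.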

I would like to see an analysis of different (cross-referenced) metrics, like stars on GitHub, number of downloads, forks, and dependents, which potentially mean more customers for Tidelift. I prefer the point-of-interest-metric approach because it would directly benefit good project ideas.

Test code should count toward weight, because the more thorough the test suite, the more work tends to have been put into both building the test suite and fixing the bugs it caught.

Test code should count toward weight

This is a valuable point that I mostly agree with, but am only starting to think through the implications of.

Based on the metrics I’ve recorded on projects I’ve worked on, a well-tested project will commonly consist of 75% or 80% tests (by lines of code). So if we did this, well-tested projects would be weighted four or five times larger than untested projects.
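That multiplier falls straight out of the arithmetic: if tests make up a fraction t of the lines, counting them inflates the line count by 1/(1 - t) relative to the production code alone. A quick check:

```python
def weight_multiplier(test_fraction: float) -> float:
    """How much larger a project's total line count is when tests are
    counted, relative to its production code alone."""
    return 1 / (1 - test_fraction)

for t in (0.75, 0.80):
    print(f"{t:.0%} tests -> {weight_multiplier(t):.1f}x weight")
```

75% tests gives a 4x multiplier and 80% gives 5x, matching the “four or five times larger” figure above.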

Although I am a huge fan of hardcore testing and TDD, to me, this seems unfair. A project might, in theory, use no tests because it has some other (hypothetical) means of maintaining good fitness for use, internal design, maintainability and quality.

If the untested project were not as good as a well-tested project, then users of that project should be the ones to judge, by selecting among competitors.

In essence, I’m saying that a poorly tested project, which will presumably have less useful functionality, be harder to maintain, and have more bugs, will be weighted down automatically by being used by fewer subscribers. We don’t need to additionally weight it down by code size.

On the other hand, if an untested project manages to somehow still provide useful functionality, be maintainable and responsive to new requirements, and have few bugs, then it deserves a full share, rather than being artificially penalized for the methods it used to get there.

(In practice, I think it’s unlikely an untested project would be able to do this. As an industry we haven’t found a practice that is as good as good tests for these purposes. But the above seems right to me in principle. Even though I’m a test zealot.)

I’m not sure that we should reward a package based on its size

This is an interesting idea. I’m trying to think through the consequences for the ecosystem from a ‘economics’ kind of viewpoint.

If we don’t reward based on weight, then Lifters will work more on highly-valued projects that are small. This might be a sensible allocation of resources. But does it mean that larger projects would disproportionately not get tackled?

Could an argument in favor of not weighting by size be that such large projects shouldn’t get tackled, because the value they return doesn’t justify the effort that gets spent on them? But, to make that concrete, what of projects like Django or Postgres? I don’t think it makes sense to say that ubiquitous projects like that are not worth the effort. Which suggests weighting by size is sensible in that it helps the community commit to things that are worthwhile.

A way to think about minification and tests might be to go back to some of the principles we’ve had in mind… some of those that feel relevant to me, along with how they’ve gone into current minification/test treatment:

  • Usage by subscribers is also part of the formula, and it’s intended to be the primary measure of “merit” (merit here means “enterprise customers want to use the package,” so we aren’t talking about a global measure of value to society, just a measure of value to the people who are paying).

  • Since the total share of income comes from “Usage * Weight,” anything in “Weight” which correlates with “Usage” gets squared and the power law on it gets steepened. So for example if the weight were based on github stars, maybe that correlates somewhat with usage, and now the very most popular packages get a squared share at the expense of long-tail packages.

  • We’re weighting packages not repositories. (Some repos publish lots of packages.) The available inputs are things that are in the package archive, which for example often doesn’t include tests, but often does include multiple “builds” of the same source, or other autogenerated code.

  • Comparability is important. We’ve been normalizing in part by ignoring the minified versions (if a package contains multiple builds of the same code, prefer to include only the unminified one). Ignoring tests is also part of this, because not all packages bundle the tests (I would say most don’t).

  • Making it Tidelift’s job to decide what counts as good development practices makes me anxious. If one project wants to use tests and another wants to use formal proofs, does either project really want Tidelift to come in and adjudicate that? I’d lean toward having us ask for the outcomes that enterprise customers want, instead of telling projects how to reach those outcomes. There’s some nuance in here for sure rather than a super bright line, but hope y’all see what I mean.
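The squared-share effect in the second bullet can be illustrated with made-up numbers: if weight is independent of usage, income scales linearly with usage, but if weight is itself roughly proportional to usage (say, a star-based weight), income scales with usage squared and the long tail’s share collapses:

```python
# Hypothetical usage counts for three packages; all numbers invented.
usages = {"giant": 1000, "mid": 100, "tail": 10}

def shares(score):
    """Normalize raw scores into fractional shares of the income pool."""
    total = sum(score.values())
    return {k: v / total for k, v in score.items()}

# Weight uncorrelated with usage (constant weight): income scales with usage.
linear = shares({k: u * 1 for k, u in usages.items()})

# Weight proportional to usage (e.g. star-based): income scales with usage**2.
squared = shares({k: u * u for k, u in usages.items()})

for k in usages:
    print(f"{k}: linear={linear[k]:.2%} squared={squared[k]:.4%}")
```

Under the squared scheme the most popular package’s share grows while the long-tail package’s share shrinks by roughly a factor of its relative usage, which is the steepened power law described above.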

The reason the weight exists is that it substantially offsets a couple of ugly effects of using usage alone:

  • incentive to split packages up into tiny packages
  • popular package containing 3 lines of code is paid the same as a gigantic framework, even though the work to provide enterprise assurances on the gigantic framework is far more