Cargo crate name reservation spam

Recently I was going through the recent crates on crates.io to see if any libraries caught my eye as something I could use to write something cool. I noticed a large number of crates created at the exact same time, with the exact same description, all at version 0.0.0 and carrying the message "WIP. Contact me if you want to use this name!". Here's a screenshot of this.

The user who has reserved all of these names is https://crates.io/users/swmon. I went through his profile, and it seems that every one of the crates on its 11 pages is just a "reserved name", bar one or two. I find it very hard to believe that one person could have so many "work in progress" projects that would require names as broad as "while", "youtube", "thef***", etc.
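To make the pattern concrete, here is a rough sketch of how such batches could be flagged automatically. It is only an illustration: the `CrateInfo` record, its field names, the owner name "parker", and the threshold are all hypothetical, and none of it reflects the actual crates.io API or any existing moderation tooling.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CrateInfo:
    # Hypothetical record; field names are illustrative, not the crates.io schema.
    name: str
    owner: str
    version: str
    description: str
    created_minute: str  # creation time truncated to the minute (made-up format)


def flag_mass_reservations(crates, threshold=10):
    """Return {owner: [crate names]} for owners who published at least
    `threshold` crates at version 0.0.0 with one shared description
    within the same minute."""
    groups = defaultdict(list)
    for c in crates:
        if c.version == "0.0.0":
            groups[(c.owner, c.description, c.created_minute)].append(c.name)

    flagged = defaultdict(list)
    for (owner, _description, _minute), names in groups.items():
        if len(names) >= threshold:
            flagged[owner].extend(sorted(names))
    return dict(flagged)
```

A heuristic like this would only surface candidates for a human to review; a legitimate project can also publish several placeholder crates at once.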

Can these crates be removed to free up the names, and perhaps consequences imposed on people who mass reserve common names they're never going to use like this?

I know it's too late unless we introduce a "legacy" namespace, but I wish Cargo had gone with namespaced packages, with a reserved / curated top-level namespace, taking the place of rust-lang-nursery and augmented with well-recognized and supported community projects. There could even be a formal auditing process for these packages, code / documentation / licensing standards, etc.

The top-level packages would be the first place a new user would look for things, and the namespaces would make it easy for users to find packages related to a specific project, while the presence of namespaces would make clear to the user "there are no guarantees for this package".

You could make the namespace optional. If there is only one package with that name, you can omit the namespace (Cargo would still save the full name in Cargo.lock, so if another package with the same name is added later your project will still compile), but if there are multiple, Cargo would throw an error and list the possibilities.
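A minimal sketch of that resolution rule, assuming a registry where every crate has a full "namespace/crate" name. Nothing here is how Cargo works today: `resolve`, `lock`, `resolve_locked`, and the example names are invented for illustration, and the lockfile is modelled as a plain dictionary.

```python
class AmbiguousCrateName(Exception):
    """Raised when a bare crate name matches more than one namespace."""


def resolve(name, registry):
    """Resolve a dependency name to its full 'namespace/crate' form.

    `registry` is a collection of full names (hypothetical data model).
    """
    if "/" in name:
        # Already fully qualified: it only has to exist.
        if name in registry:
            return name
        raise KeyError(name)

    candidates = sorted(full for full in registry if full.split("/", 1)[-1] == name)
    if not candidates:
        raise KeyError(name)
    if len(candidates) > 1:
        raise AmbiguousCrateName(
            f"'{name}' is ambiguous; candidates: {', '.join(candidates)}"
        )
    return candidates[0]


def lock(dependencies, registry):
    """Record the full name for every dependency, as a lockfile would."""
    return {dep: resolve(dep, registry) for dep in dependencies}


def resolve_locked(name, lockfile, registry):
    """Prefer the locked full name, so later additions cannot break a build."""
    if name in lockfile:
        return lockfile[name]
    return resolve(name, registry)
```

With a registry containing only "alice/json", a dependency on "json" locks to "alice/json". If "carol/json" is published afterwards, the existing project keeps resolving through its lockfile, while a fresh `resolve("json", ...)` raises the ambiguity error listing both candidates.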

Why is it harder to remember an author's name than a "creative" crate name? The goal of flat-namespacing was explicitly stated to be encouraging people to come up with random names like "nokogiri". Maybe that hasn't happened so much yet on crates.io, but we do have stuff like a crate named "reqwest" solely because "request" and "requests" were taken.

Good question. I'd guess that remembering a short creative name is easier than remembering the nick of a person (likely based on a foreign name) plus a (possibly creative) crate name.

As an experiment, here are the crates and authors I remember (I'm not checking whether they are correct, nor the spelling of the names, so you can see how many mistakes I make):

serde_json by dtolnay

I think serde is by dtolnay too, but I'm unsure

futures by alexcrichton

toml by alexcrichton

mio by carllerche

bytes by carllerche

ripgrep by burntsushi - but that one isn't a library

csv by burntsushi (very unsure about this one)

rtfm by japaric

copper by japaric

m4 by japaric (not sure about crate name)

embedded_hal by japaric (or some other member of embedded wg?)

syslog by geal

simple_server by steveklabnik

And here are crates with creative names (name not obviously associated with what it does) that I can remember:

tokio - async framework

iron - web framework

gotham - web framework

rocket - web framework

hyper - http parser

pest - parser framework

nom - parser framework

rtfm - real time for the masses - real time embedded framework

copper - embedded framework

piston - game engine

conduit - service mesh framework, I think

clap - command line argument parsing

At first sight it might look like username + crate name wins, but that's not actually the case. First, I don't think I use many crates with creative names, so the second list isn't shorter because of my inability to remember them, but because there aren't many of them. Secondly, if I had to rely on the first list (author plus crate name), I'd be (theoretically) unable to use any other crate. However, I remember both the creative names listed above and "normal" names, many of which appear in the first list, and there are others (websocket, protobuf, reqwest, serde, etc).

In other words, I remember more crate names than crate names together with their authors. Of course, it's anecdotal evidence, but I'd be surprised if anyone remembered the nick of the author of every crate they use.

As burkadurka suggests, I think the situation could be much better even for the crates you remembered. Consider Iron, for instance. Iron is actually a large, multi-crate project, and just remembering "iron" is probably not enough for you to get to work on a real project.

Here are the Crates.io links for every Iron crate Iron lists on their official GitHub page:

Every single one of those crates is Iron-specific, despite what I'd guess most newcomers would assume from the names. And if you were searching for "iron" on Crates.io, the first of these packages you'd find with the default "Relevance" sort order would be "iron-sessionstorage" on the second page of results, and then "urlencoded" on the fourth. The abandoned "iron-params" also shows up on the fourth page, before the actual crate for Iron, named simply "params", which appears on the fifth. Third-party crates for Iron are named inconsistently, mostly prefixed with "iron-" or "iron_", though sometimes with other variations. There's absolutely no way to tell if an Iron-related crate on the Crates.io search results page is an "official" Iron crate without opening the result, and even on the individual crate page, the only real evidence a crate is "official" is hovering over the "Repository" link and confirming it is prefixed with "github.com/iron".
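For what it's worth, that manual "hover over the Repository link" check can be written down as a small sketch. The function name `is_official` and the example URLs are illustrative assumptions; the only rule taken from the post above is that an official crate's repository lives under the "iron" organization on GitHub. Note that it compares the whole organization segment, because a plain string-prefix test on "github.com/iron" would also accept an unrelated account whose name merely starts with "iron".

```python
from urllib.parse import urlparse


def is_official(repository_url, org="iron"):
    """Return True if the URL points at a repository owned by `org` on GitHub.

    Expects a full URL with a scheme (e.g. "https://..."); anything else
    is treated as not official.
    """
    parsed = urlparse(repository_url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[len("www."):]

    # Path looks like "/<org>/<repo>[/...]"; drop empty segments.
    segments = [s for s in parsed.path.split("/") if s]
    return (
        host == "github.com"
        and len(segments) >= 2
        and segments[0].lower() == org.lower()
    )
```

Having to do this per crate, by hand or by script, is exactly the kind of workaround a namespace would make unnecessary.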

I don't think this is a good situation, and there are many other examples of this issue besides Iron. Namespaces would fix nearly all of this (overly generic names for very specific packages, related crates for a project being hard to find, lower-quality results appearing before high-quality ones due to string matching, no way to tell the difference between official and unofficial crates at a glance), including third-party crate naming consistency if there were "open" namespaces, so Iron could keep "iron/" closed and "iron-community/" open.

I agree that parking could still be an issue, but I think it'd be much less of one. It's easier to moderate, in my opinion. There are many credible reasons for someone to need hundreds of packages, but there are very few credible reasons for someone to need hundreds of namespaces.

Also, with namespaces the answer can always be "more namespaces". This is more complicated than simply "Cargo and Crates.io should support namespaces", but you could imagine GitHub organizations / usernames or even domain names (think Java) acting as "reserved namespaces". "github/[username]/*" would be reserved for that GitHub user, while "[username]/*" would be reserved for a Crates.io user or organization. This system could also be used for integrating "custom package repositories" in a syntactically elegant way (i.e., you'd mount your repository to whatever namespace you want). As long as you don't allow users to publish packages containing "/" or to nest namespaces, there's no need for the Cargo team to reserve "github", "bitbucket", or predict other top-level namespaces they might want to create, either.
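Here is a small sketch of why the segment count alone is enough to keep those two kinds of namespace apart. Everything in it is hypothetical: the `PROVIDERS` set, the function names, the allowed characters, and the assumption that every crate is namespaced are choices made for the example, not an existing Cargo or Crates.io design.

```python
import re

# Hypothetical set of external identity providers that can be "mounted".
PROVIDERS = {"github"}

# Illustrative rule for a single name segment; "/" is never allowed inside one.
_SEGMENT = re.compile(r"[A-Za-z0-9_-]+")


def parse_full_name(full_name):
    """Split a full crate name into (kind, owner, crate).

    "user/crate"          -> ("crates.io", "user", "crate")
    "github/user/crate"   -> ("github", "user", "crate")
    """
    parts = full_name.split("/")
    if any(_SEGMENT.fullmatch(p) is None for p in parts):
        raise ValueError(f"invalid name: {full_name!r}")

    if len(parts) == 2:
        # Two segments is always a Crates.io user or organization,
        # even if that user happens to be called "github".
        return ("crates.io", parts[0], parts[1])
    if len(parts) == 3 and parts[0] in PROVIDERS:
        return (parts[0], parts[1], parts[2])
    raise ValueError(f"invalid name: {full_name!r}")


def may_publish(full_name, crates_io_user, github_user=None):
    """Check whether the given identities own the namespace of `full_name`."""
    kind, owner, _crate = parse_full_name(full_name)
    if kind == "crates.io":
        return owner == crates_io_user
    return github_user is not None and owner == github_user
```

Because users cannot publish nested names themselves, a Crates.io account named "github" publishing "github/params" never collides with "github/params/somecrate", which belongs to the GitHub user "params".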