from the relax,-be-happy dept

One of the most important moments in the rise of a radical idea is when the fightback begins, because it signals an acceptance by the establishment that the challenger is a real threat. That moment has certainly arrived for open access, most obviously through moves like the Research Works Act, which would have cut off open access to research funded by the US government. That attack soon stalled, but the sniping at open access and its underlying model of free distribution has continued.

There is a persistent conceit stemming from the IT arrogance we continue to see around us, but it's one that most IT professionals are finding real problems with -- the notion that storing and distributing digital goods is a trivial, simple matter, adds nothing to their cost, and can be effectively done by amateurs.

As a result, he thinks, there is "a consistent theme among dew-eyed idealists about publishing -- that digital goods are infinitely reproducible at no marginal cost, and therefore can be priced at the rock-bottom price of 'free'."

Well, they're certainly "infinitely reproducible", but nobody seriously claims that can be done at zero marginal cost. It is, however, extremely small. Indeed, in another post, Anderson himself provides a rough estimate for one part of the cost -- the online delivery of a 1Mbyte file: $0.001. It's true that delivering millions of copies would represent a more significant sum, but that ignores things like BitTorrent, which effectively shares the cost of distributing digital goods among many downloaders. Using such P2P delivery systems, the cost to the publisher really is vanishingly small.
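As a rough back-of-the-envelope check, here is a minimal Python sketch of that arithmetic. It uses the $0.001 per 1 Mbyte figure quoted above. The copy counts and the helper name delivery_cost are illustrative assumptions, and so is the premise that cost scales linearly with the amount of data delivered.

```python
# Rough estimate quoted in the text: $0.001 to deliver a 1 Mbyte file online.
COST_PER_MBYTE_USD = 0.001


def delivery_cost(copies: int, size_mbytes: float = 1.0) -> float:
    """Total delivery cost in dollars for `copies` files of `size_mbytes` each.

    Hypothetical helper: assumes cost grows linearly with bytes delivered
    and that the publisher pays for every copy (no P2P sharing).
    """
    return copies * size_mbytes * COST_PER_MBYTE_USD


if __name__ == "__main__":
    # Illustrative copy counts, not figures from the article.
    for copies in (1, 1_000, 1_000_000):
        print(f"{copies:>9,} copies: ${delivery_cost(copies):,.3f}")
```

On these assumptions a million copies come to about $1,000 in total, and that is before any of the load is pushed onto downloaders through something like BitTorrent.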

But Anderson thinks there are other issues:

Even beyond just their power requirements, digital goods have particular traits that make them difficult to store effectively, challenging to distribute well, and much more effective when handled by paid professionals.

According to the source referenced, "the difference between an empty e-reader and a full one is just one attogram" -- a million-trillionth of a gram. Even with "hundreds of thousands of Kindles in the field," that extra fraction of a gram spread around the world is hardly going to be a major problem. But leaving aside the issue of weight, it's certainly true that this data takes up space on storage media:

The proliferation of digital goods -- photos, music, Web pages, blog posts, social media shares, tweets, ratings, movies and videos, and so much more -- puts incredible and growing pressure on metadata management techniques and layers. This means building more and larger warehouses, which adds to both ongoing costs for current users and migration costs as older warehouses are outstripped by new demands. Megabytes become gigabytes become terabytes become zettabytes and beyond. Where will they all fit?

One answer is "in your pocket": according to Amazon, a 1 terabyte portable hard disc currently costs around $100. Yes, a zettabyte might be a little more pricey, but judging by this recent large-scale, real-life project, we're still in the sub-petabyte era, so storing all this data isn't really going to require a warehouse -- a few rack systems should suffice.
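To put rough numbers on that, here is a small Python sketch that scales the $100-per-terabyte price up to larger units. It assumes decimal (SI) units and that the consumer price scales linearly. It ignores redundancy, enclosures, power and staff, so treat it as a lower bound on raw disc cost only. The names are hypothetical.

```python
# Price quoted in the text: about $100 for a 1 terabyte portable hard disc.
PRICE_PER_TERABYTE_USD = 100

# Decimal (SI) units, expressed in terabytes. This is an assumption;
# binary units would shift the figures slightly.
TERABYTES_PER_UNIT = {
    "terabyte": 1,
    "petabyte": 10**3,
    "exabyte": 10**6,
    "zettabyte": 10**9,
}


def raw_disc_cost(unit: str) -> int:
    """Cost in dollars of buying enough 1 TB consumer discs to hold one `unit`."""
    return TERABYTES_PER_UNIT[unit] * PRICE_PER_TERABYTE_USD


if __name__ == "__main__":
    for unit in TERABYTES_PER_UNIT:
        print(f"1 {unit:<9} ~ ${raw_disc_cost(unit):,}")
```

On these assumptions a petabyte of raw disc costs about $100,000, while a zettabyte would indeed be "a little more pricey" at around $100 billion.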

But independently of where you are going to put it, another question is: Where is all that important metadata going to come from? As Anderson rightly says:

Creating, updating, and tracking the metadata is a chore for owners of digital goods. Poor metadata -- like a photo name off your digital camera of DX0023 -- can make the photo hard to find or use. Better metadata -- usually applied by humans, like "Rose in bloom, August 2006" for that elusive photo -- makes more sense.

Each tweet is a JSON file, containing an immense amount of metadata in addition to the contents of the tweet itself: date and time, number of followers, account creation date, geodata, and so on.

That is, the data comes with "an immense amount of metadata" automatically, because of the way Twitter (wisely) designed its system. And even for datasets that require metadata to be applied by hand, crowdsourcing is proving an efficient and low-cost way of providing it.

Other issues raised by Anderson are that digital goods need to be backed up and kept secure, but that's hardly rocket science: open source solutions that cost nothing to acquire (but not to run, obviously) have been around for years. His main concern, however, seems to be about the physical infrastructure required:

Digital warehouses are more expensive to build. Site planning is a major undertaking. A physical warehouse is something a small business owner can buy and construct with relative ease. They aren’t expensive (a concrete pad, a sheet metal structure, some crude HVAC, and a security system is usually all it takes). A digital warehouse is expensive to construct -- servers, site planning, redundant power requirements, high-grade HVAC, earthquake-proofing, and so forth. This means that digital goods have to work off a much higher fixed warehouse cost.

It seems unlikely that it is cheaper to build a typical physical warehouse than to install a typical LAMP stack on rented commodity servers in a few different geographical locations (or in the cloud) to provide resiliency and backups. This exposes the central problem with Anderson's argument about the amount of data that must be handled, and the necessity for huge and expensive infrastructure to handle it: he seems to be lumping together very different kinds of digital data.

In the realm of digital goods, we’re reaching a point at which we’re facing trade-offs. Already, some data sets are propagating at a rate that exceeds Moore’s Law, which may still accurately predict our ability to expand capacity. And these are purposeful data sets. As data becomes an effect of just living -- traffic monitoring software, GPS outputs, tweets, reviews, star ratings, emails, blog posts, song recommendations, text messages -- we as a collective will easily outstrip Moore’s Law with our data. If there’s no place to put it, and nobody to manage it, does it exist?

Yes, genomic data is spewing out of DNA sequencers at an incredible rate; yes, the Large Hadron Collider produces almost unimaginable quantities of data. But these are exceptions: nobody is talking about letting the general public access this stuff in the same way that they can download media files, say. As I've pointed out in a previous post, we are fast approaching the point where we could store every Spotify track on a single hard disc, and the same will soon be true for every film, book -- and academic article.

For the latter, despite Anderson's title, it really is the case that storing and sharing them is nearly free, pretty easy and mostly trivial, which is why open access makes sense and is constantly gaining ground. The sooner traditional publishers stop fearing and fighting this trend, the sooner they can embrace and enjoy the possibilities this new abundance opens up for them.

Action: Have the publisher of an orthopedic surgery medical journal write an article about open source software and digital goods storage.

Result: Guy writes that Kindles are going to weigh more because of all the digital goods inside them.

Maybe an editor at the Society for Scholarly Publishing misheard "open-source software" as "orthopedic surgery", and things just went downhill from there? I can't think of any other reason for sending in an amateur instead of a trained professional, anyway...

Tangibility and costs

TLDR: His arguments that the tangibility and costs of digital goods are nonzero are technically correct, but those costs are so incredibly small that TechDirt and rational analysis win once more over irrational defenses of legacy business models. Hooray!

Re:

I just wonder why all of these buggywhip producers, like the -AAs and the dead tree publishers keep fighting the obvious and inevitable - they NEVER win. Technology changes, constantly. The purveyors of the old, obsolete technologies either adapt and change, or die (usually noisily with much thrashing of everything in sight, the Constitution and human rights be damned - we're talking about PROFITS here!).

Re:

I love how, in your attempt to find a flaw to point out, you ignore the fact that Glyn was writing in response to an article that didn't mention anything about the costs of producing the goods to start with.

Re: archive.org

Re:

... given that the entire article had NOTHING TO DO WITH THAT, of course he did.

it was about warehousing and distribution, which are marginal costs so far as the product the consumer buys is concerned (even if setting them up is a fixed cost). MAKING the thing is a fixed cost and, as mentioned every other bloody time this comes up, is IRRELEVANT to the price per item to the consumer. it is the amount your overall PROFITS per item have to overcome for the product line/business to have been worth the effort. it's a different layer of calculation. there's per item profit (sale price per item - marginal cost per item) and then there's your product's profit ( (per item profit * items sold) - fixed cost of product ) then there's your business's profit (though i'm not sure if the last two are usually separated) which is the total of the profits and losses of all products less whatever other expenses your business incurs in its running (overhead i guess? taxes? whatever.)

they're all numbers. they're all money going in and out. they're all Different Things.

Techdirt is, generally speaking, talking about the First one. morons like You keep trying to claim that somehow magically translates into the second and by implication the third. the only one that MATTERS is the third. the second is relevant to that, but if one thing makes a loss on the second level to allow another thing to make a greater Gain on that level the third level increases.

the constant attempt to apply a fixed percentage of the second level's costs as an excuse for the excessively high prices per infinitely (or near) reproducible item at the first level leads to over-pricing, dropping the number of sales per item, dropping the income per product but not the cost, thus dropping the profit of the product and of the overall business, which is the one you should CARE about.
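The three layers of profit described in this comment can be sketched in a few lines of Python. This is only an illustration of the formulas as the commenter states them. The class name, function name and all the numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Product:
    sale_price: float     # price per item charged to the consumer
    marginal_cost: float  # cost of delivering one more item
    items_sold: int
    fixed_cost: float     # cost of MAKING the thing in the first place

    def per_item_profit(self) -> float:
        # Layer 1: sale price per item - marginal cost per item
        return self.sale_price - self.marginal_cost

    def product_profit(self) -> float:
        # Layer 2: (per item profit * items sold) - fixed cost of product
        return self.per_item_profit() * self.items_sold - self.fixed_cost


def business_profit(products: Iterable[Product], overhead: float) -> float:
    # Layer 3: total of all product profits and losses, less other expenses
    return sum(p.product_profit() for p in products) - overhead


if __name__ == "__main__":
    # Made-up numbers: one paid product and one loss-making free product.
    paid = Product(sale_price=10.0, marginal_cost=0.5, items_sold=1000, fixed_cost=5000.0)
    free = Product(sale_price=0.0, marginal_cost=0.5, items_sold=2000, fixed_cost=1000.0)
    print(paid.product_profit(), free.product_profit())
    print(business_profit([paid, free], overhead=1500.0))
```

In this made-up example the free product loses money at the second layer, yet the business as a whole still comes out ahead at the third.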

Re:

After the production is completed there are no relevant costs to it is there?

A spacefaring ship used to cost billions of dollars to design; now some people are doing it for pennies, so to speak. See Copenhagen Suborbitals: nobody laughs at them anymore.

The genome cost billions to do it the first time around, now it costs thousands of dollars.

So I am curious: what goods do you refer to?

The only way to compete with centralized production is by being decentralized, no more being dependent on one company/entity/person to deliver the goods.

In fact property rights are an impediment for economic growth, those days where a few could demand others to do something because they hold all the cards are over, in this cycle people are going back to the basics, they will start producing their own products that means companies will get desperate and try to block that from happening.

Which means more attempts at granted monopolies until the people get fed up and fuck them up.

Talking about the weight of digital data only seems to consider devices that use Flash memory, like the Kindle or iPad. That's all on the user's end. Isn't most digital data in the world currently stored on hard drives, which use magnetic recording methods? Unless I'm mistaken, a full hard drive shouldn't weigh any more than an empty one because the surface of the disk isn't changing states, the magnetic patterns are just being rearranged.

What he should find more amazing is his own arrogance that he actually understands something because he makes money from other people's content.
He willfully ignores his own bias and arrogance while saying it is the other guy who is arrogant.

If someone framed this exact same argument in the same terms, but in areas where he isn't an "expert" he would call them an idiot. Sometimes it is useful to step outside yourself and look at the situation without all of your own bias in the way.

Approaches zero

Personally I find it more accurate to say the marginal cost approaches zero. Compared to the cost of printing, transporting, and storing physical goods, it may as well be zero.

More than anything the guy who wrote the original article seems completely oblivious to scale. Sure a "data warehouse" (whatever that is) might be more expensive than a physical one that takes up the same amount of real world space. It also holds countless orders of magnitude more. And when you compare based on capacity, the only thing that matters, digital is vastly cheaper and more efficient. He'd have to be willfully blind not to see that.

Let the old guard scream.

Let them thrash around, and waste their resources ineffectively trying to hold back the sea. Their paid trolls, shills, and lawmakers will all try to stem the tide. To no good. There is no fighting an idea whose time has come.

What they'll find, is that having spent their resources fighting against it, they have little left to join it when they no longer have a choice. By fighting the future, instead of embracing, guiding, and becoming leaders of it, they effectively are destroying any profitable future they may have.

Interesting...

Here's the thing. I agree that digital warehousing, storage, and whatnot are very very cheap compared with housing physical goods.

But he is right to say that data storage and bandwidth are not free. For a popular site, the costs can be relatively enormous.

Here's the thing. The ones who bear the cost of the storage and bandwidth are usually not the ones who actually produce the material to be stored or transmitted. They are the ones who have found business models that are content-agnostic, scalable, and sustainable.

Yet, the "warehousing and delivery" aspect is almost always brought up by the idiots claiming "free content isn't free." See e.g. Lowery's letter to Emily White: "It turns out the supposedly 'free' stuff really isn’t free" (linking to this story at Scholarly Kitchen). This directly contradicts the notion (again from Lowery) that sites like iTunes "simply hosting the songs on their servers. They do absolutely nothing else."

Well, yeah, they do something else. They warehouse, archive, and deliver digital goods. Something that Lowery insists costs money and isn't free. Yet, they are absolutely livid when the people who actually bear those costs take a cut of the content sales. It is really disgusting, frankly.

Re: Interesting...

Re: Econ 101 FAIL

Congrats on not understanding the difference between Fixed Cost and Marginal Cost and why Fixed Cost (which is what you brought up) is irrelevant here.

This particular commenter has basically been professionally "not understanding" the difference between fixed and marginal costs in our comments for over five years now. It's impressive. It's been explained to him probably over 100 times and he still feigns ignorance. At this point, it's pretty clear he's not here for honest debate.

Re: Re:

The heat energy absorbed whilst the unit is on or even just during the day far outweighs any energy gained whilst storing data.

The thing about flash and hard disks is they actually start in a higher energy state before you put data on them as you change 0xFF (all 1's, highest energy) to whatever data you have.

So I would counter the argument that Kindles get heavier and would say they actually get lighter by an attogram. But that is coming from an electronic engineer (me) who has slightly more grasp of physics and how flash works than the guy who originally made this argument (Professor Kubiatowicz), who is a computer scientist.

My response to him ....

Interesting article. You are not a techie? There is this rule of thumb for computing called Moore’s Law that states, the power of computing will double every 18 months and the cost will remain roughly the same. The same can be said for data storage technology, which is also an exponential curve.

Within the next 7-15 years we will have nanotechnology. This will allow us to store all current human knowledge (circa 2012) in a storage device the size of a deck of cards. Your image of warehouse-sized storage facilities is reminiscent of early 1950s speculation about computing. By their standards, using vacuum tubes and relays, a home PC would have been the size of a house and used the power output of a coal-fired power plant.

If someone knows for sure, please, offer some typical numbers to clarify the situation.

Anyway, "warehousing and delivery" feels like a bullshit argument from someone who wants to disguise his profit margin. I mean... look at Spotify: they use more storage and bandwidth than anyone and their service is still ridiculously cheap. That said, I'm guessing there's other costs for a STORE, that wouldn't affect a streaming service nearly as much.

I'm thinking the costs of a typical online store for purely digital goods pretty much break down like this:

Bandwidth: 1%
Storage: 1%
Tech staff: 3%
Customer support: 95%

I could be wrong of course, but bandwidth is cheap, and so is hardware. If the system is cleverly built, you don't need a huge number of techies to maintain and develop it. The one big expense I can think of is the same as always, customer support: my order didn't arrive, you sent the wrong CD, your site is rejecting my VISA card... That has to be the real cost, right? That's where you need a warehouse (full of support staff, not hard drives) costing you a pile of money every month.

If someone argued that the staff is a big expense in online distribution of digital content, then I'm thinking they may be right. If someone talks about expensive bandwidth and storage, I just feel like jumping up on a table, waving my arms and screaming like a monkey.

Anyway, that's just what I'm thinking. If anyone has any actual experience with it, by all means chip in.

Re:

Or well, when I write "my order didn't arrive, you sent the wrong CD" I'm just having a brain freeze, it'll sound more like:

"I didn't get the email with the download link..."
"The download link is linking to the wrong product..."
"I ordered the wrong season of Star Trek by accident..."
"Internet Explorer says your site is dangerous!"

...and so on and so forth. Physical or digital - shit happens, and at the end of the day, customers want a human being to sort it out for them.

Huh. While I can't speak for proper 'warehousing' goods (for example, the iTunes store or even the Wikileaks server farm), personally speaking, at home, buying and filling my 1TB ext HDD turns out to be a lot cheaper and easier than buying and filling a cupboard for physical copies of movies and CDs. I'm not too hot at estimating weights anyway, but I'm assured that it's a lot lighter as well.

Re: Approaches zero

well, i build data warehouses, for many different industries, and I can tell you it ain't cheap, by any stretch. yes it is getting cheaper (Moore's Law) but physical space, climate control, electricity, connectivity, even wiring is all involved prior to adding any digital 1's and 0's. so marginal costs are not really approaching zero.

if you have/know an IT department, i invite you to ask them what their overhead is monthly. some data centers do in fact sell digital goods, most do not, and both have overhead. lots and lots and lots of overhead.

Re: Re: Re:

> The thing about flash and hard disks is they actually start in a higher energy state before you put data on them as you change 0xFF (all 1's, highest energy) to whatever data you have.

From what I know, that is not true. For flash memory, writing means injecting electrons into a floating gate; this makes the output read as 0 due to the way it is wired (the normal state reads as 1). So what reads as zero is the state with more stored energy.

As to hard disks, they start at 0x00, not 0xFF (just look at the contents of a new hard disk).

Re: An attogram more?

I'd go him one better:

"First, digital goods are not intangible. They occupy physical space, be that on a hard drive, on flash memory, or during transmission."

Fine, I'll take my buddy's external hard drive with 3,000 CDs and 800 movies on it. We'll load physical copies of those movies and CDs into boxes for Mr. Anderson to carry (all at the same time - assuming he'd even be able to lift them all at once). Then we'll go for a five mile walk. We'll see who gets there first.

Re: Re: Econ 101 FAIL

To quote the 4th Doctor- "The very powerful and the very stupid have one thing in common. They don't alter their views to fit the facts, they alter the facts to fit their views. Which can be very uncomfortable if you are one of the facts that need altering."

Re:

A medium-quality microphone: $100. A good (GNU+)Linux distribution: Almost free if downloaded. A decent computer: Less than $1500. Internet and CDs: Under $20. Using sites like Jamendo, Libre.FM or the Internet Archive to host the songs: Almost free for both sides. Promoting the album: Word-of-mouth is free, ads are really cheap (under $10 per month), and social networking does wonders. All in all, almost anyone can make a successful album with a month's salary. And the cost can be easily recovered (and even multiplied) if those meager costs are crowdfunded beforehand!

Re: Re: Approaches zero

OK, the initial costs are actually rather large. But, on the large scale, the costs are lower than physical alternatives, require far fewer resources to maintain, and have a vastly larger deployment area thanks to the Internet. In short, it may not be exactly zero, but it's very, very, VERY cheap compared with alternatives.

Mixed Vertical Markets

I think the biggest flaw in Anderson's logic is he is assuming the IT costs for the ENTIRE process vs. just the necessary IT costs for the Entertainment industry.

Here's an example; Hollywood doesn't need to create huge datacenters for each movie they make to distribute it electronically, just as they don't need to directly procure and manage a fleet of trucks to deliver DVDs to customers.

Existing datacenters are there with more than enough capacity and he should only consider costs related to the production of content and server hosting/co-location costs. He doesn't need to worry about what it cost to build the facility, cool it, or anything else that's not in his own segment.

...Unless he's just trying to make a point that "It costs money to build datacenters with lots of disk drives that send and receive lots of data". That's kind of a no-brainer. But for people who produce content and need to leverage that infrastructure for distribution; it's painfully cheap.

Re: Re: Econ 101 FAIL

"This particular commenter has basically been professionally "not understanding" the difference between fixed and marginal costs in our comments for over five years now. It's impressive."

Actually, what is impressive is that someone with your level of education seems unwilling to accept that the rules are somewhat different in a business where most of the costs are fixed costs, or pre-production costs. Paying attention ONLY to marginal costs will put you in the poor house.

It's impressive that you seem unwilling to acknowledge how the whole process goes, and that you would rather narrowly focus on one area. It's entirely misleading, and you know it. Why keep it up?

Re: Re: Re: Econ 101 FAIL

Actually, it's you who are impressive, trying to distract from the point of the article. Nowhere does it discuss the products being stored, or the cost to produce them. Everything else that you keep wanting to bring up is irrelevant to the article and you're acting rather thick just to try and keep going "but what about my $100 million movie?!?!" (Wow. You're like bob! And that isn't a good thing.)

As for Mike, well... again, since the article isn't about what you want it to be about there's no need for him to acknowledge any of that. The topic is something else. Why you keep trying to bring up the rest of the process is entirely beyond me. It's irrelevant. Why keep it up? Can you not acknowledge that the warehousing and delivery of digital goods is cheap and trivial (to an extent)? Answer that. Focus only on that. Leave the rest for a related article. Otherwise, realize that the type of behavior you're exhibiting is why people mock you and don't take you seriously.

Human Costs (to: maclypse, #33)

Precisely. And when you make the content free, most of that stuff goes away. You don't need to know anyone's VISA card number, and you cannot be accused of having sold the VISA number to the Russian Mafia. If a download hangs up, the reader just hits the reload button, and no hard feelings. You avoid Java, Javascript, Flash, and Cookies like the black plague, and that keeps Internet Explorer happy.

Reputable medical research is mostly paid for by grants from reputable funding agencies, such as the National Institutes of Health. The additional work of preparing things for publication is small compared to the cost of actually doing the research, and the cost of actual distribution even smaller. Parenthetically, journal articles tend to be so specialized that the potential audience is less than a thousand, or certainly, less than ten thousand. The cost of distributing the information is also small compared to the cost of reading it. At any rate, sooner or later, the more reputable funding agencies will want to publish the research they sponsor, in order to disassociate themselves from research funded by less reputable organizations. The sponsors who hang onto publication in proprietary journals will inevitably be those sponsors, such as the tobacco industry, who have the most to lose from a true statement of who is paying for the research.

One of the value added features of a Journal is their reputation. If a particular Journal is known to reject bad submissions, then the material they do publish gains credibility by association with the Journal. If the Journal is known to favor bad submissions (Think National Enquirer for an example) then the material loses credibility by association with the Journal.

Regardless of what they publish each Journal pays Editors, copywriters, customer support/sales, peer reviewers (these may be low or no cost depending on availability and quality, but generally there will be a quality dependent cost) and then there is the per issue cost for assembly and publication of a print edition & monthly costs for the IT and internet presence. Many of these costs have a fixed component+marginal component so that a small distribution has a much higher per copy cost with each edition.

Individual researchers are free to publish their work independently; this has not changed in centuries. However, it is usually established researchers with a solid reputation or well established research centers with a solid reputation publishing these works. Publications from unknowns will automatically be judged as poor or fraudulent simply because they did not get into a 'proper' Journal.

The Journals need to pay their bills so they price their submissions & subscriptions according to their estimated costs & income. Better quality with low distribution will cost more, larger distributions will generally lower the per piece cost & mediocre quality will lower it still more. There is a bottom where the price levels out due to periodicals priced below the point where they are taken seriously being ignored. (That seeming paradox is why small items sell better at $19.99 than they do at $9.99 ... the lower price means it isn't worth $20 :P ) For mass market periodicals the sweet spot minimum seems to be $4-$6 based on checking newsstands. Limited subscription periodicals do not have the economy of scale that allows those prices though.