Nothing Works Anymore So Plan Accordingly

It’s perspicacious, so read the whole thing. On a related note, I’ve literally been working on finding a solution for the shipping problem for Europe all morning. And the steps we are probably going to have to take to resolve the issues involved are absurd to the point of bordering on the comedic. The good news is that should we ever feel the need to branch out into trafficking various forms of contraband, we will have a comprehensive network in place.

There’s a cocktail party version of the efficient markets hypothesis I frequently hear that’s basically, “markets enforce efficiency, so it’s not possible that a company can have some major inefficiency and survive”. We’ve previously discussed Marc Andreessen’s quote that tech hiring can’t be inefficient here and here:

Let’s launch right into it. I think the critique that Silicon Valley companies are deliberately, systematically discriminatory is incorrect, and there are two reasons to believe that that’s the case. … No. 2, our companies are desperate for talent. Desperate. Our companies are dying for talent. They’re like lying on the beach gasping because they can’t get enough talented people in for these jobs. The motivation to go find talent wherever it is unbelievably high.

Variants of this idea that I frequently hear engineers and VCs repeat involve companies being efficient and/or products being basically as good as possible because, if it were possible for them to be better, someone would’ve outcompeted them and done it already.

There’s a vague plausibility to that kind of statement, which is why it’s a debate I’ve often heard come up in casual conversation, where one person will point out some obvious company inefficiency or product error and someone else will respond that, if it’s so obvious, someone at the company would have fixed the issue or another company would’ve come along and won based on being more efficient or better. Talking purely abstractly, it’s hard to settle the debate, but things are clearer if we look at some specifics, as in the two examples above about hiring, where we can observe that, whatever abstract arguments people make, inefficiencies persisted for decades.

When it comes to buying products and services, at a personal level, most people I know who’ve checked the work of people they’ve hired for things like home renovation or accounting have found grievous errors in the work. Although it’s possible to find people who don’t do shoddy work, it’s generally difficult for someone who isn’t an expert in the field to determine if someone is going to do shoddy work in the field. You can try to get better quality by paying more, but once you get out of the very bottom end of the market, it’s frequently unclear how to trade money for quality, e.g., my friends and colleagues who’ve gone with large, brand name, accounting firms have paid much more than people who go with small, local, accountants and gotten a higher error rate; as a strategy, trying expensive local accountants hasn’t really fared much better. The good accountants are typically somewhat expensive, but they’re generally not charging the highest rates and only a small percentage of somewhat expensive accountants are good.

More generally, in many markets, consumers are uninformed and it’s fairly difficult to figure out which products are even half decent, let alone good. When people happen to choose a product or service that’s right for them, it’s often for the wrong reasons. For example, in my social circles, there have been two waves of people migrating from iPhones to Android phones over the past few years. Both waves happened due to Apple PR snafus which caused a lot of people to think that iPhones were terrible at something when, in fact, they were better at that thing than Android phones. Luckily, iPhones aren’t strictly superior to Android phones and many people who switched got a device that was better for them because they were previously using an iPhone due to good Apple PR, causing their errors to cancel out. But, when people are mostly making decisions off of marketing and PR and don’t have access to good information, there’s no particular reason to think that a product being generally better or even strictly superior will result in that product winning and the worse product losing.

In capital markets, we don’t need all that many informed participants to think that some form of the efficient market hypothesis holds, ensuring that “prices reflect all available information”. It’s a truism that published results about market inefficiencies stop being true the moment they’re published because people exploit the inefficiency until it disappears.

But as we also saw, individual firms exploiting mispriced labor have a limited demand for labor and inefficiencies can persist for decades because the firms that are acting on “all available information” don’t buy enough labor to move the price of mispriced people to where it would be if most or all firms were acting rationally.

In the abstract, it seems that, with products and services, inefficiencies should also be able to persist for a long time since, similarly, there also isn’t a mechanism that allows actors in the system to exploit the inefficiency in a way that directly converts money into more money, and sometimes there isn’t really even a mechanism to make almost any money at all. For example, if you observe that it’s silly for people to move from iPhones to Android phones because they think that Apple is engaging in nefarious planned obsolescence when Android devices generally become obsolete more quickly, due to a combination of iPhones getting updates for longer and iPhones being faster at every price point they compete at, allowing the phone to be used on bloated sites for longer, you can’t really make money off of this observation. This is unlike a mispriced asset that you can buy derivatives of to make money (in expectation).

A common suggestion to the problem of not knowing what product or service is good is to ask an expert in the field or a credentialed person, but this often fails as well. For example, a friend of mine had trouble sleeping because his window air conditioner was loud and would wake him up when it turned on. He asked a trusted friend of his who works on air conditioners if this could be improved by getting a newer air conditioner and his friend said “no; air conditioners are basically all the same”. But any consumer who’s compared items with motors in them would immediately know that this is false. Engineers have gotten much better at producing quieter devices when holding power and cost constant. My friend eventually bought a newer, quieter, air conditioner, which solved his sleep problem, but he had the problem for longer than he needed to because he assumed that someone whose job it is to work on air conditioners would give him non-terrible advice about air conditioners. If my friend were an expert on air conditioners or had compared the noise levels of otherwise comparable consumer products over time, he could’ve figured out that he shouldn’t trust his friend, but if he had that level of expertise, he wouldn’t have needed advice in the first place.

So far, we’ve looked at the difficulty of getting the right product or service at a personal level, but this problem also exists at the firm level and is often worse because the markets tend to be thinner, with fewer products available as well as opaque, “call us” pricing. Some commonly repeated advice is that firms should focus on their “core competencies” and outsource everything else (e.g., Joel Spolsky, Gene Kim, Will Larson, Camille Fournier, etc., all say this), but if we look at mid-sized tech companies, we can see that they often need to have in-house expertise that’s far outside what anyone would consider their core competency unless, e.g., every social media company has kernel expertise as a core competency. In principle, firms can outsource this kind of work, but people I know who’ve relied on outsourcing, e.g., kernel expertise to consultants or application engineers on a support contract, have been very unhappy with the results compared to what they can get by hiring dedicated engineers, both in absolute terms (support frequently doesn’t come up with a satisfactory resolution in weeks or months, even when it’s one a good engineer could solve in days) and for the money (despite engineers being expensive, large support contracts can often cost more than an engineer while delivering worse service than an engineer).

This problem exists not only for support but also for products a company could buy instead of build. For example, Ben Kuhn, the CTO of Wave, has a Twitter thread about some of the issues we’ve run into at Wave, with a couple of followups. Ben now believes that one of the big mistakes he made as CTO was not putting much more effort into vendor selection, even when the decision appeared to be a slam dunk, and more strongly considering moving many systems to custom in-house versions sooner. Even after selecting the consensus best product in the space from the leading (as in largest and most respected) firm, and using the main offering the company has, the product often not only doesn’t work but, by design, can’t work.

For example, we tried “buy” instead of “build” for a product that syncs data from Postgres to Snowflake. Syncing from Postgres is the main offering (as in the offering with the most customers) from a leading data sync company, and we found that it would lose data, duplicate data, and corrupt data. After digging into it, it turns out that the product has a design that, among other issues, relies on the data source being able to seek backwards on its changelog. But Postgres throws changelogs away once they’re consumed, so the Postgres data source can’t support this operation. When their product attempts to do this and the operation fails, we end up with the sync getting “stuck”, needing manual intervention from the vendor’s operator and/or data loss. Since our data is still on Postgres, it’s possible to recover from this by doing a full resync, but the data sync product tops out at 5MB/s for reasons that appear to be unknown to them, so a full resync can take days even on databases that aren’t all that large. Resyncs will also silently drop and corrupt data, so multiple cycles of full resyncs followed by data integrity checks are sometimes necessary to recover from data corruption, which can take weeks. Despite being widely recommended and the leading product in the space, the product has a number of major design flaws that mean that it literally cannot work.
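To get a feel for why a 5MB/s ceiling turns a full resync into a multi-day affair, here is a rough back-of-the-envelope sketch. The only number taken from the text is the 5MB/s throughput; the database sizes and the function name `full_resync_days` are hypothetical, chosen purely for illustration, and the calculation assumes decimal units (1 GB = 1000 MB) and a perfectly sustained transfer rate with no retries.

```python
SECONDS_PER_DAY = 86_400


def full_resync_days(db_size_gb: float, throughput_mb_per_s: float = 5.0) -> float:
    """Days needed to copy db_size_gb at a sustained throughput.

    Assumes decimal units (1 GB = 1000 MB) and no retries or stalls,
    so real resyncs would likely take longer than this estimate.
    """
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    seconds = (db_size_gb * 1000) / throughput_mb_per_s
    return seconds / SECONDS_PER_DAY


# Hypothetical database sizes; none of these figures come from the article.
for size_gb in (100, 500, 1000, 2000):
    print(f"{size_gb:>5} GB -> {full_resync_days(size_gb):.1f} days")
```

Under these assumptions, a single pass over a 1 TB database takes a little over two days, so a few cycles of resync plus integrity checking can plausibly stretch into weeks.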

This isn’t just an issue that impacts tech companies; we see this across many different industries. For example, any company that wants to mail items to customers has to either implement shipping themselves or deal with the fallout of having unreliable shipping.

I wish I’d read this six months ago. But at least it confirms the necessity, and the wisdom, of setting up our own shipping centers.