The Network Formerly Known As The Information Superhighway

Berlin, Germany

We're going to talk a little bit about the internet.

We're expanding on the high-level concept of multiple sources of truth, and on more focused domains like devices, deployments, and user accounts, which taken together comprise a field of truth - a large mass of information that software developers have to navigate to answer simple questions: "what happened?" and "why did this happen?". At the same time, we are returning to the model of software as digital manufacturing, because the internet plays an important role in it. In physical manufacturing there are distribution channels - physical goods for use in everyday life make their way from a factory and, through some journey, arrive in a store or perhaps a warehouse, and eventually in someone's home. They do this by navigating a network of public roads, perhaps hopping between courier services.

In the digital world, the internet is that network of roads.

There are a few important points of difference. One is that the process of distribution is largely driven by autonomous systems, even more so than the traffic systems of the physical world. Another is that our vehicles, transmitting data back and forth over the internet, have no agency. They are instead directed by other parties. A third point is that while there can be what amounts to a form of gridlock, the way that gridlock is resolved is quite different. If information is not received quickly enough, it is discarded. These are generally applicable truths across a variety of contexts. If you travel to remote areas of the world during holiday periods, you can probably witness some of this first-hand when you try to access popular apps like YouTube or Spotify. You'll probably experience extremely long response times, perhaps as though nothing is happening at all.

The way that data is collected, organised, and processed around the world has changed a lot over the last few decades, and much is being documented in real time today about how the make-up of this global network is changing. In the last decade or so, large data centres (think warehouses of computers) have been constructed around the world, creating aggregation points that all kinds of software applications can rely on, at a price. This is the cloud. It enabled new kinds of software, as purchasing infrastructure - a large upfront cost - was no longer a prerequisite to creating applications that depend on large volumes of data.

Networks and cloud infrastructure are a really broad topic. To relate them back to our grand narratives of digital manufacturing and the field of truth - there are a number of ways in which their existence shapes how users understand software. Users have become accustomed to the cloud being there, doing a lot of work processing data on behalf of their device before delivering it. This is a pretty big challenge for large offline-first apps. The cloud has, at least in principle, given us the ability to scale computing infrastructure anywhere from zero machines to a very large number of machines, autoscaling based on demand. For applications with a low usage volume this is problematic in the real world because it can take a considerable amount of time to get from 0 to 1, perhaps dozens of seconds. This is long enough that if it happens in the public sphere, it can materially shape the consumption patterns of your application. Over-optimising for cost can kill your project.
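To make the 0-to-1 problem concrete, here is a minimal sketch of the trade-off between scaling to zero and keeping one machine warm. Everything in it is an assumption for illustration: the `ScalingConfig` type, its field names, and the timing values are hypothetical and do not come from any particular cloud provider.

```python
from dataclasses import dataclass


@dataclass
class ScalingConfig:
    """Hypothetical autoscaling settings, for illustration only."""
    min_instances: int            # machines kept running even with no traffic
    cold_start_seconds: float     # assumed time to boot the first machine
    warm_latency_seconds: float   # assumed latency once a machine is running


def first_request_latency(config: ScalingConfig) -> float:
    """Latency for the first request after an idle period.

    After idling, only `min_instances` machines are still running.
    If that number is zero, the request waits for a cold start first.
    """
    if config.min_instances > 0:
        return config.warm_latency_seconds
    return config.cold_start_seconds + config.warm_latency_seconds


# Assumed numbers: a 20 second cold start and a 0.2 second warm response.
scale_to_zero = ScalingConfig(min_instances=0, cold_start_seconds=20.0,
                              warm_latency_seconds=0.2)
keep_one_warm = ScalingConfig(min_instances=1, cold_start_seconds=20.0,
                              warm_latency_seconds=0.2)

if __name__ == "__main__":
    print("scale to zero:", first_request_latency(scale_to_zero), "seconds")
    print("keep one warm:", first_request_latency(keep_one_warm), "seconds")
```

The cheaper configuration is the one that makes the first visitor wait, which is the sense in which over-optimising for cost can hurt a low-traffic application.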

Security is a consideration in the modern environment. Modern cloud providers automatically block traffic they deem suspicious, and sometimes this results in the blocking of legitimate traffic, which can create problems at any stage in the software development lifecycle. What happens in the cloud is one part; let's talk about the other side for a moment.

It is easy enough to know if your device is connected to a network: we can use the concept of the handshake agreement established earlier. But that's not quite the same as knowing whether it is connected to the broader internet - we can't shake hands with everyone around the world, can we? So how does a device know it is connected to the internet? How does a device know if a website is down? If you pause for a moment to really think about this, you might think about the device visiting a particular, very reliable address, or perhaps multiple addresses that are unlikely to be inaccessible simultaneously. There is no convenient and foolproof way around this. Your device could periodically make (arguably wasteful) requests to a number of addresses, or it could hold out hope for a response for a limited amount of time, and if the wait exceeds some threshold, consider it a connection failure. Different software platforms take different approaches to this, but they all make educated guesses. The global network of digital roads has no centralised authority.
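The educated guess described above can be sketched in a few lines. This is one possible approach under stated assumptions, not how any specific platform does it: the function names `tcp_probe` and `guess_online` are hypothetical, the addresses in `DEFAULT_ENDPOINTS` are placeholders, and the two second timeout is an arbitrary threshold. It tries each address in turn and treats the first successful connection as evidence of being online.

```python
import socket
from typing import Callable, Iterable, Tuple

Endpoint = Tuple[str, int]  # (host, port)

# Placeholder addresses, assumed for illustration. A real application would
# choose hosts that are unlikely to be unreachable at the same time.
DEFAULT_ENDPOINTS = [("example.com", 443), ("example.org", 443)]


def tcp_probe(endpoint: Endpoint, timeout: float) -> bool:
    """Return True if a TCP connection opens within `timeout` seconds."""
    try:
        with socket.create_connection(endpoint, timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, DNS failures, and refused or unreachable hosts.
        return False


def guess_online(
    endpoints: Iterable[Endpoint] = DEFAULT_ENDPOINTS,
    timeout: float = 2.0,
    probe: Callable[[Endpoint, float], bool] = tcp_probe,
) -> bool:
    """Educated guess: online if at least one endpoint answers in time."""
    return any(probe(endpoint, timeout) for endpoint in endpoints)


if __name__ == "__main__":
    print("probably online" if guess_online() else "probably offline")
```

A `False` result only means that none of the chosen addresses answered in time. It cannot distinguish between a device that is offline, a network in gridlock, and probe targets that happen to be down, which is why the result is a guess rather than a fact.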

If you have a globally available mobile application, your users could be anywhere. The information superhighway may be flowing at high speed, or it may be in gridlock. Another dimension for the field of truth.