An introduction to API first

#APIs #REST #SUGCON #Microservices

Published on 11 August 2022 by Andrew Owen (5 minutes)

This week’s article is the long-promised expansion of the lightning talk I gave at SUGCON 2022 in Budapest. If you’re coming fresh to the subject, you might like to read my earlier articles on Jamstack and MACH, and on event-driven architectures.

As developers, you’ll probably be familiar with all the terminology. But in case any lay readers have stumbled across this, I’ll explain the terms as I go along, starting with Application Programming Interfaces (APIs). At the most basic level, an API describes how two or more software processes communicate. These processes could exist within the same software running on the same computer, or across multiple programs on multiple computers.

As with many elements of modern development, APIs can be public or private. Public APIs provide an external interface for third-party software to interact with the system. Private APIs are used exclusively internally. Even in monolithic software (one large program), you may find both types. But in distributed solutions, where software components run across multiple systems, having a well-defined way for components to communicate is essential.

SE Basic IV’s API portal provides a simple illustration of public/private APIs. The left menu contains a list of all the modules, and each module contains a list of all the routines that can be called. Most of these routines are intended only for internal use by the firmware (in an ideal world, they would still be fully documented to make it easier to maintain the software). However, the routines in the Vectors module are different. These are the public APIs. The private APIs are subject to change, but the public APIs are fixed.

This is assembly language, so the vector table ensures that even if the routines move around in memory in later versions of the software, there is a fixed address to call each routine. The parameters don’t change. If there is a better way to do something, a new API endpoint (routine) is added, but the old one is left in place. We say it’s deprecated: that means don’t use it. But it is still there in case there is any software that depends on it. Deprecated API endpoints should only be removed if you’re confident that there is no software in use that depends on them that can’t be easily updated to use the new endpoint.
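The idea of a fixed vector table in front of movable routines can be sketched in a few lines of Python. This is an illustration only, not SE Basic IV's actual code: the slot numbers, routine names and the `call_vector` helper are all made up for the example.

```python
import warnings

# Internal routines: free to be renamed, moved or rewritten between versions.
def _load_file_v1(path):
    return f"v1 loaded {path}"

def _load_file_v2(path, address=0x8000):
    return f"v2 loaded {path} at {address:#06x}"

# Public vector table (hypothetical): slot numbers are fixed and never reused,
# in the same way that a vector's address stays fixed in the firmware.
VECTORS = {
    0: _load_file_v1,  # deprecated, but left in place for existing callers
    1: _load_file_v2,  # the newer, preferred entry point
}

# Maps each deprecated slot to the slot that replaces it.
DEPRECATED = {0: 1}

def call_vector(slot, *args):
    """Call a public routine by slot number, warning if it is deprecated."""
    if slot in DEPRECATED:
        warnings.warn(
            f"vector {slot} is deprecated; use vector {DEPRECATED[slot]}",
            DeprecationWarning,
            stacklevel=2,
        )
    return VECTORS[slot](*args)
```

Callers only ever see the slot numbers, so the internal functions can change freely. Old callers of slot 0 keep working and get a warning rather than a crash.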

Depending on when you’re reading this, you may only see the file system API in the list (the console API will be introduced in the next beta). These endpoints are designed for ease of use by assembly language programmers. They encapsulate an individual task, such as loading a file from disk into memory. In fact, to carry out that operation, several calls have to be made to a lower-level disk API. But the programmer shouldn’t have to worry about managing the file handle, opening the file, reading the file stats, reading the file from disk and then closing the file. Instead, the programmer passes a file path and the address in memory where the file is to be loaded. The routine does the rest, including error handling (for example, if the file is too large to fit in memory at the given address).
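Here is a rough Python sketch of that kind of wrapper: one public call that hides the open, stat, read and close steps. The `Disk` class is a stand-in for a lower-level disk API and is entirely hypothetical, as are `load_file` and `FileTooLargeError`. The real routines are written in assembly language and will differ.

```python
class FileTooLargeError(Exception):
    """Raised when a file does not fit in memory at the requested address."""

class Disk:
    """Hypothetical low-level disk API backed by an in-memory dict of files."""

    def __init__(self, files):
        self._files = files      # path -> bytes
        self._handles = {}       # handle -> path
        self._next_handle = 1

    def open(self, path):
        if path not in self._files:
            raise FileNotFoundError(path)
        handle = self._next_handle
        self._next_handle += 1
        self._handles[handle] = path
        return handle

    def stat(self, handle):
        """Return the file size in bytes."""
        return len(self._files[self._handles[handle]])

    def read(self, handle, size):
        return self._files[self._handles[handle]][:size]

    def close(self, handle):
        del self._handles[handle]

    def open_handles(self):
        return len(self._handles)

def load_file(disk, memory, path, address):
    """Load the file at `path` into `memory` at `address`; return its size."""
    handle = disk.open(path)
    try:
        size = disk.stat(handle)
        if address + size > len(memory):
            raise FileTooLargeError(f"{path} does not fit at {address:#06x}")
        memory[address:address + size] = disk.read(handle, size)
        return size
    finally:
        # The handle is always released, even when an error is raised.
        disk.close(handle)
```

The caller supplies only a path and an address. Handle management and the size check stay inside the routine.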

But that’s old school. Most of the time when we’re talking about APIs we’re talking about Web APIs, typically what are known as RESTful APIs. These use HTTP methods, carry their payloads as JSON, and are defined in some kind of schema, such as OpenAPI, or in a Postman collection. That’s a lot to unpack.

JavaScript is the language of web applications, and JSON (JavaScript Object Notation) is a text format for data that is based on JavaScript’s object syntax. If you’re familiar with CRUD (create, read, update, delete), REST does something similar using HTTP methods (POST, GET, PUT/PATCH, DELETE). An API schema is a machine-readable description of an API’s endpoints, parameters and payloads.
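To make the CRUD comparison concrete, here is a minimal Python sketch of a request handler that maps HTTP methods to operations on an in-memory store, with JSON text as the payload. The `handle` function and the store are hypothetical. A real service would sit behind a web framework, and would usually let the server assign the ID on POST rather than take it from the caller.

```python
import json

# How each HTTP method lines up with a CRUD operation.
CRUD_BY_METHOD = {
    "POST": "create",
    "GET": "read",
    "PUT": "update",    # replace the whole record
    "PATCH": "update",  # change only the given fields
    "DELETE": "delete",
}

def handle(store, method, item_id, body=None):
    """Apply one request to `store`; return (status_code, json_text)."""
    operation = CRUD_BY_METHOD.get(method)
    if operation is None:
        return 405, json.dumps({"error": "method not allowed"})
    if operation == "create":
        if item_id in store:
            return 409, json.dumps({"error": "already exists"})
        store[item_id] = json.loads(body)
        return 201, json.dumps(store[item_id])
    if item_id not in store:
        return 404, json.dumps({"error": "not found"})
    if operation == "read":
        return 200, json.dumps(store[item_id])
    if operation == "update":
        changes = json.loads(body)
        if method == "PUT":
            store[item_id] = changes
        else:
            store[item_id].update(changes)
        return 200, json.dumps(store[item_id])
    # Only "delete" is left at this point.
    del store[item_id]
    return 204, ""
```

For example, `handle({}, "GET", "1")` returns a 404 status, because nothing has been created yet.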

First-class citizens

APIs:

Provide a common interface to software components.
Decouple the software from the interface.
Enable interaction between systems.

When people talk about API-first, you’ll often hear mention of APIs being first-class citizens. At its simplest, this means that the product is split into a backend solution that provides APIs, and a client application that consumes those APIs. Priority is supposed to be given to creating consistent and reusable APIs. In practice, though, APIs are often created in an ad-hoc manner, and some poor tech writer is expected to document the result (in my view, this isn’t API-first).

Documentation matters

I’d go further and say that in a true API-first development environment, the APIs should be devised before anyone starts writing any code. But one of the benefits of APIs is that if you get it wrong the first time, you can create a new API and deprecate the old one, hopefully without breaking anything. This becomes more important as new standards emerge, such as banking and commerce APIs, and as large companies acquire smaller companies and seek to integrate products into a unified API experience.

Another thing I’ve seen a lot of is startups that get most of that right and then don’t document the APIs. Then customers are left struggling to understand how to use them. Developers hate writing docs, which is why I think the best approach is to spec out the API and hand it to the developer. But of course, developers should have input into the API design in the first place. So I propose:

Have an API task group. Do the API first.
Use standards. Use a schema.
Publish your public APIs.
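As a rough idea of what “use a schema” looks like when the API is written down before any code, here is a small OpenAPI-style description held as a Python dictionary, with a helper that lists its operations. The API itself (title, path, summaries) is invented for the example, and the helper names are my own. In practice the description would normally live in a YAML or JSON file and be fed to existing OpenAPI tools.

```python
import json

HTTP_METHODS = {"get", "put", "post", "delete", "patch"}

# Hypothetical API description; the title, path and summaries are made up.
SPEC = {
    "openapi": "3.0.3",
    "info": {"title": "Example Files API", "version": "1.0.0"},
    "paths": {
        "/files/{fileId}": {
            "parameters": [
                {
                    "name": "fileId",
                    "in": "path",
                    "required": True,
                    "schema": {"type": "string"},
                }
            ],
            "get": {
                "summary": "Read one file record",
                "responses": {"200": {"description": "The file record"}},
            },
            "put": {
                "summary": "Replace one file record",
                "deprecated": True,  # still present, but callers should move on
                "responses": {"200": {"description": "The updated record"}},
            },
        }
    },
}

def list_operations(spec):
    """Return (METHOD, path, deprecated) for every operation in the spec."""
    operations = []
    for path, item in spec["paths"].items():
        for key, operation in item.items():
            if key in HTTP_METHODS:  # skip shared keys such as "parameters"
                deprecated = operation.get("deprecated", False)
                operations.append((key.upper(), path, deprecated))
    return operations

def publish(spec):
    """Serialize the description as JSON text, ready to hand to tooling."""
    return json.dumps(spec, indent=2)
```

Because the description is plain data, the task group can review it and it can be published before the backend exists.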

Microservices and beyond

The second tier of API-first is microservices. These components should be as atomic as possible, and completely replaceable. The reality is that many organizations that promote API-first still have a lot of monolithic software to support. But it is possible to retrofit an API to older software. And if you manage to do that, then you can replace components one at a time with microservices until eventually your solution is no longer monolithic.
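One way to picture the retrofit is an adapter that gives the old code the same interface a future microservice will have, so the caller never knows which one it is talking to. The sketch below is hypothetical throughout: the pricing example, class names and method signatures are made up, and a real microservice would be reached over the network rather than called in-process.

```python
from typing import Protocol

class PricingAPI(Protocol):
    """The agreed interface; prices are in minor currency units."""

    def quote(self, sku: str, quantity: int) -> int:
        ...

class LegacyMonolith:
    """Stand-in for old code with an awkward internal call signature."""

    _PRICE_TABLE = {"A1": 250, "B2": 1000}

    def calc_price_internal(self, sku_code, qty, apply_discount_flag):
        total = self._PRICE_TABLE[sku_code] * qty
        return total // 2 if apply_discount_flag else total

class LegacyPricingAdapter:
    """Retrofits the agreed API onto the monolith without changing it."""

    def __init__(self, monolith):
        self._monolith = monolith

    def quote(self, sku, quantity):
        return self._monolith.calc_price_internal(sku, quantity, False)

class PricingMicroservice:
    """Replacement component that implements the same API directly."""

    def __init__(self, prices):
        self._prices = dict(prices)

    def quote(self, sku, quantity):
        return self._prices[sku] * quantity

def basket_total(pricing: PricingAPI, basket):
    """Client code depends only on the API, not on what sits behind it."""
    return sum(pricing.quote(sku, qty) for sku, qty in basket.items())
```

Swapping `LegacyPricingAdapter` for `PricingMicroservice` needs no change to `basket_total`, which is what lets components be replaced one at a time.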

But APIs are only the start. There’s also:

Components and messaging.
Data lakes.
Event sourcing.

But this is already a long article, and those will make good topics for future articles.