
Taming the API Sprawl
by JB Rainsberger

Ten years ago, there may have been only a single application that talked directly to the database and spit out HTML for every team - customer service, sales, and so on. Most of the organizations I work with have been moving toward a design philosophy more like Unix, where each application consists of a series of small tools stitched together. In the web example above, that likely means a login service combined with web pages that call other services - enter record and update record, say. That allows the customer service team to write their own tools using the web, the command line, scheduled jobs, or any other interface.

Sounds too good to be true, doesn't it? It is true, but it comes at a cost. For example, I never defined a mechanism to manage the explosion of APIs that results from this approach. Consider this: uncontrolled growth is one definition of cancer.

Since the rapid creation of APIs is not quite so deadly, I will call this the API Sprawl; I've seen it at every client that has moved to a web-services approach, typically two to four years into the conversion.

Today I'll tell you how to avoid the API Sprawl or, if it's already too late for that, how to tame it.

Starting from a Mess
One of the much-touted advantages of web services is reusability. Instead of re-writing the lookup in SQL directly to find out whether a member has coverage, we call the coverage_lookup(date, memberId) function. Instead of writing it in one programming language, we expose it over the web, so code in any programming language can call it.
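To make the idea concrete, here is a minimal sketch of what calling such a service might look like from Python. The endpoint path, query parameters, and JSON response shape (a "covered" field) are illustrative assumptions, not a real API:

```python
import json
import urllib.request

def coverage_url(base_url, date, member_id):
    """Build the query URL for the (hypothetical) coverage service."""
    return f"{base_url}/coverage?date={date}&memberId={member_id}"

def coverage_lookup(date, member_id, base_url="http://services.example.com"):
    """Ask the coverage service whether a member has coverage on a date."""
    with urllib.request.urlopen(coverage_url(base_url, date, member_id)) as response:
        return json.loads(response.read())["covered"]
```

Any language that can make an HTTP request and parse JSON can call this - that's the whole appeal.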

All wonderful so far.

Sadly, in order to use this function, we need to find it. Without controls, the web services for an organization start to feel a bit like a junk drawer. First we have to go looking for what we seek, either talking to teams or searching through intranet pages. Then, eventually, we find three tools that almost do what we want (we need to know pharmacy coverage, or whether the user has hit their deductible, or some other detail). The existing services don't quite work for us, so we write a fourth service, must_pay(date, memberId, coverage_type).

After going through this trouble only a few times, we've created three new services. The whole notion of reusing these services at all becomes suspect.

This cycle helps us more efficiently destroy the spirit of the next well-meaning person who optimistically thinks someone must already have built this....

Taming this API sprawl certainly means putting services where programmers can find them, but I want to be careful. Shiny, exciting promises got us into this mess; a shiny new service repository and a mass email won't be enough to get us out.

Think about it this way: The world's most meticulously-curated library of terribly-written, unedited novels is a joke; it would provide almost no practical value to its community. You can't think "If we build it, they will come." Yes, you have to build it, but if they don't trust it, they will stay away.

A Service Directory
Before the internet we had the yellow pages - a book designed so that people could find phone numbers and addresses for services. The phone book was free to everyone who used it, paid for by the businesses that advertised in it. If the advertisers weren't discoverable, then they wouldn't pay for the ads and the yellow pages wouldn't be profitable.

That means that the number one issue for the yellow pages was helping people find what they need, by subject or name. A company service directory should do the same - make services discoverable, and make the registration and change process not overly painful. You can probably find a team member whose eyes light up at this kind of challenge - the idea of hunting down the best tool for the job.
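A bare-bones sketch of that yellow-pages idea: a directory you can register services in and search by subject or name. The field names and the keyword-matching rule here are my own assumptions for illustration, not a prescribed schema:

```python
class ServiceDirectory:
    """A toy in-memory service directory, searchable like the yellow pages."""

    def __init__(self):
        self._entries = []

    def register(self, name, url, keywords, owner):
        """Add a service; keeping this painless is half the battle."""
        self._entries.append(
            {"name": name, "url": url, "keywords": set(keywords), "owner": owner}
        )

    def find(self, keyword):
        """Return all entries matching a keyword, by subject or by name."""
        kw = keyword.lower()
        return [
            entry for entry in self._entries
            if kw in entry["name"].lower()
            or kw in {k.lower() for k in entry["keywords"]}
        ]
```

A real directory would live behind a web interface and persist its entries, but the essential operations - painless registration and search by subject or name - are the same.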

You might also invest time, energy, and money in all this, look up two years from now, and find that not much has changed. Without improving trust among the people who build and use the APIs, you'll find yourself circling back, looking for ways to deal with the situation, and you might even need to read this article again.

Beyond setting up a services repository and publishing it to your group, getting your APIs under control involves improving discipline, a fair amount of thankless work, and considerable patience. It means weeding out duplication among your services and making those APIs self-documenting through better and more consistent naming. You'll need to document and, in some cases, reverse-engineer the contract of each and every service. This has to happen, not because of some abstract sense of "right" and "wrong," but in order to address the key reason that the APIs keep sprawling under your feet: people just don't trust the services enough to risk their projects on them.

What Builds Trust?
The topic of building and maintaining trust fills a bookshelf, not just a book. You will probably need to read some of these books, but don't despair: I can give you a place to start today. As a rule of thumb, you can start to gain a programmer's trust by making available to them services that do what they need and that work. There might still remain deeper trust issues that get in the way, but if you can do this, you'll have come a long way. Good news: this gives you a clear path in the general direction of success. Bad news: this involves some frustrating work. You need to believe that your group is suffering from sprawl enough to merit solving the problem and trust that what I propose will help. (See? Trust problems are everywhere. You really should look into those books.)

The services themselves need to work, because if they don't, then I can't depend on them. Sadly, "it works" ranks as one of the most ambiguous sentences in English, so let me clarify what I mean: the services need to do what they claim to do. This almost certainly means executable examples - what some might call integration tests. I define "it works" to mean "passes these tests". Put another way, I define "what the service does" in terms of the tests it passes.
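Here's what a couple of those executable examples might look like for the must_pay service mentioned earlier. The must_pay body below is a stand-in stub so the examples can run at all - its rule (members whose IDs start with "M" have pharmacy coverage) is invented for illustration; in practice the tests would call the real web service:

```python
def must_pay(date, member_id, coverage_type):
    """Stub standing in for the real must_pay web service."""
    covered = member_id.startswith("M") and coverage_type == "pharmacy"
    return not covered

# Each test is one executable example: together they *are* the contract.
def test_covered_member_does_not_pay():
    assert must_pay("2016-03-01", "M123", "pharmacy") is False

def test_unknown_coverage_type_means_member_pays():
    assert must_pay("2016-03-01", "M123", "dental") is True
```

When "what the service does" is defined by tests like these, a failing test is news, not a matter of opinion.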

Now Add Tests
When I want to use existing code and I'm not sure what it does, I start by writing tests for it on its own. I do this in order to challenge my assumptions about how it behaves. When one of these tests passes, I think Aha! I was right about what it does! and when one of these tests fails, I wonder Is the service broken, or have I made an unreasonable assumption about what it should do? As the collection of examples grows, so does my understanding of the service. From this growing suite of tests, I can make guesses about what the service probably does in similar situations, helping me write a more comprehensive, trustworthy set of tests. When I then publish this service and the examples in a repository for others to use, prospective clients can judge more easily whether it will help them solve their problem, what it will do, what it won't do, and whether using the service requires less investment than building it themselves. It all starts with writing tests.
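A tiny sketch of that assumption-challenging loop, using a made-up legacy function (the function and its behavior are assumptions for illustration). Each assert records what I *think* the code does; a failure tells me either the code is broken or my assumption was unreasonable:

```python
def legacy_rounding(amount):
    """Imagine this is the existing, undocumented code I found."""
    return int(amount + 0.5)

# Assumption-challenging examples:
assert legacy_rounding(2.4) == 2    # passed: my assumption held
assert legacy_rounding(2.5) == 3    # passed: rounds halves up, as I guessed
assert legacy_rounding(-2.5) == -2  # surprise! I expected -3 - broken, or my bad assumption?
```

Each surprise becomes a documented example, and the growing suite becomes the record of what the code actually does.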

So write tests. Not just when creating a service, not just when documenting it for the repository - even when teams use a service they can write tests for it, leaving behind a record of what they've learned, their use cases for the service and their expected results. If you run out of time to write tests, then at least document your assumptions and expectations about what a service does. Provide examples, even if you can't run those examples right away. (By writing examples, even as bullet points in documentation, we have an option to turn them into running tests when we have time and energy.) Make those tests easy to find and easy to run. Make those examples easy to find and easy to read. Do this over and over, even when it seems like nobody cares. This illustrates one of the key points of trust: you have to give a lot of it away before you start getting it back from others.

As the suite of tests for a given service grows, so does our confidence in the service, and we can move it someplace where others can find it and start using it. In this way, we use a kind of internal open-source model of incubator projects that feed the prestigious, published Service Repository. In the incubator projects we find the services that "aren't ready for prime time". These are the ones we're slowly-but-surely testing and documenting. Maybe we're refactoring them to improve the public interface. Maybe we're incrementally rewriting some of them (don't go overboard on this), intending to publish the new versions and phase out the old ones. Over time, we move services from these incubators into The Services Repository. This builds even more trust, because it clearly delineates which services we believe are ready and which ones need more work. It acts as a kind of service level agreement for our APIs: you should expect these ones to work and to be able to understand them, but those ones over there - that's still beta territory. Maybe alpha. Be careful. We're working on it.
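The promotion step can be as simple as a status field on each directory entry, acting as that rough service level agreement. The status names and entry shape here are assumptions for illustration:

```python
STATUSES = ("incubator", "published")

def promote(entry):
    """Move a service one step toward the published repository (no-op at the top)."""
    i = STATUSES.index(entry["status"])
    if i + 1 < len(STATUSES):
        entry["status"] = STATUSES[i + 1]
    return entry

service = {"name": "must_pay", "status": "incubator", "tests": 42}
promote(service)
# service["status"] is now "published"
```

The point isn't the mechanism - it's that anyone browsing the repository can see at a glance which services the team stands behind.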

This is very similar to what we do in programming: we write some code once, then turn it into a method for internal use, and only eventually, with some hesitancy, do we add it to a library for some other programmer to use.

With the strategy above, by the time a service is added to the repository, it is discoverable, documented, and has examples.

This is sort of the opposite of our Sprawl Scenario - we avoid the temptation to publish a service until we are confident that it is ready for prime time. As you build trust with the programmers, they'll slowly accept your mistakes more graciously, but it takes time. The more meticulously you go about this, the sooner you'll build the critical mass of trust you need for the system to sustain itself. And it all starts by just writing tests.

Get Your APIs Flowing
Remember: A library with the most well-written, carefully-edited books in the world doesn't succeed if people don't know it exists, can't find anything in it, or don't trust it. Yes, you need a services repository, but beyond that, you need to repair some of the damage done to those who have already suffered through the sprawl. You need to regain their trust. I've given you one key technique for doing that, but you'll still probably want to learn more about trust than just the beneficial effects of writing tests. I know that you have your hands full with REST, SOAP, XML, JSON, and all that stuff, but if you don't do the things that will help programmers trust those web services, then you'll just be back re-reading this article in two years, wondering what went wrong.

Make the next book you read about trust, and you'll find it easier to get this sprawl under control. Trust me.

Where to go for more
The technical side of small changes often trips up our goal to be more adaptive; lack of trust can kill change dead in its tracks. Here are a few references to help keep you out of the muck.

Refactoring Databases. It seems strange, but APIs act a lot like database schemas, at least in the sense of being difficult to change. Reading this book will give you some ideas how to deal with changing published APIs.

The Five Dysfunctions of a Team. The next level of depth in understanding the role that trust has in any team situation, including but definitely not limited to API sprawl.

The Speed of Trust. When you're ready to dive deeply into trust as an ingredient in organizational success, you can start here.

