Managing Software Debt 2015 Predictions

2015 is fast approaching, and for the first time I feel the urge to make public predictions about what the new year will bring, seen through the lens of Managing Software Debt. Interestingly enough, many of these predictions revolve around topics this blog was created to discuss. This blog has been a long time coming: my last official blog post prior to November was in 2012, so let me take a slight diversion to describe how this blog came into being.

My last blog, gettingagile.com, had run its course after 178 posts from 2005 to 2012. It seemed to me that we were Beyond Agile and I needed a different focus. After publishing Managing Software Debt: Building for Inevitable Change in 2010, a reference on 5 types of software debt and how they can be managed and monitored, I found that focusing on leading change through Continuous Delivery and reducing Configuration Management Debt had the best results in my consulting experience. Since that time, the DevOps movement has hit full stride and embodies this approach to leading change in organizations. I think we are on the precipice of our next industry revolution in reducing the cost of change for software, now that cloud has become a common enterprise platform of choice. With that, here are the areas of the software development, deployment and operations ecosystem that I think are going to see significant interest in 2015.

  • PaaS (Platform as a Service)
  • Twelve-Factor Apps and Microservices
  • Feature Teams around Business Capabilities
  • Deployment Orchestration

PaaS

OK, OK. I know that my new role is Product Owner for PaaS at CenturyLink Cloud, but there is a good reason I took this role in November (time to note my disclaimer that the views expressed on this blog are mine alone). In my last few roles, the impact of infrastructure development enabling deployment of applications and services to cloud platforms was significant enough to give me pause. It seemed that the problems being solved were similar across development efforts: load balancing apps & services, event publishing & processing, service discovery, continuous delivery pipelining, blue/green deployments and infrastructure provisioning, just to name a few. We had looked at multiple vendor offerings, but what we saw prior to 2014 had been, in our opinion, immature. As 2014 progressed, the popularity of containers kicked off a valuable conversation about separation of concerns for deployment and infrastructure. Containers provided a piece of the solution that allowed infrastructure and application/service development to execute within their own life cycles, beyond what Puppet and Chef had done for configuration management.

After I heard about the Product Owner role here at CenturyLink Cloud, where I’d be working with a team to deliver PaaS based on Cloud Foundry, I updated my knowledge of the PaaS space. I had already been playing with Docker and worked with others who implemented a build pipeline for our services based on Docker containers. Through this process it was clear that there were still significant problems to solve beyond the ease-of-development story, to which Docker was just an introductory chapter. While researching Cloud Foundry again, after trying it out back in 2012 and deciding not to use it, I was pleasantly surprised by how far the platform had come. I immediately took notice of an aspect of Cloud Foundry called Warden, which manages isolated, ephemeral, and resource-controlled environments (aka containers). It has been around since November 2011 and has the full Cloud Foundry ecosystem surrounding it, which looked to solve more of the problems in the infrastructure and application/service deployment space than other alternatives available today.

As more enterprise developers see the benefits of PaaS, such as ease of development and deployment with low overhead for configuration management and operations involvement, there will be a large upswing in its adoption. Folks in operations will also benefit, and further enable the DevOps culture, as PaaS continues to mature and allows for more self-service provisioning and deployment while still providing the visibility needed to support service level agreements and control operating costs. Look for more on PaaS in this blog as we learn more about how customers innovate and deliver on our upcoming PaaS offering.

Twelve-Factor Apps and Microservices

My last post, The Imminent Acceleration of the Twelve-Factor Apps Approach, already discussed why I think The Twelve-Factor App and Microservices will be big in 2015. Some folks in our industry are already realizing the benefits of these approaches. It is probably not surprising that this realization came mostly from outside the monolithic ESB and SOA tool vendor offerings. Instead, the rise of SPAs (single page apps), RESTful APIs, OpenID/OAuth, cloud computing, open source and many other emergent approaches from the community have been the potion for increased adoption of service-oriented architectures. Look for significant changes in enterprise application development to support The Twelve-Factor App approach and the implementation of Microservices (or at least less monolithic architectures) as 2015 progresses.

Feature Teams around Business Capabilities

As the chapter “Platform Experience Debt” from the Managing Software Debt book explained, organizations are more flexible when there is clearer alignment of teams to business capabilities and ultimately to their users. The rise of DevOps has brought to light the cultural changes needed to be more adaptive (and dare I say “agile”). It has also ignited a follow-on movement from the software development centric agile movement toward incorporating production operations as an aspect of team responsibility. This has made it even more apparent that the collaborative Feature Teams configuration helps drive alignment of business capabilities with user needs. Not only that, this organizational alignment creates less brittle boundaries between teams than component (or functional) team configurations do. DevOps has definitely made its mark with Gene Kim’s book The Phoenix Project and the outbreak of DevOps oriented conferences around the world. Look for the DevOps movement to accelerate as more real world change stories are shared in the new year.

Deployment Orchestration

With the rise of PaaS, The Twelve-Factor App and Microservices, the need for more effective deployment orchestration tools and processes will grow. Enterprise Operations groups will need strategies for dealing with the increased frequency of deployment, proliferation of environments and running processes, and hybrid models with internal data centers and cloud architectures used in conjunction with public cloud provider offerings. Continuous Integration servers and access to server instances are not enough. The number of deployment models, platforms, and network topologies will make governance a mess. In 2015 we will need to start finding solutions to orchestrating deployments from build to validation to deployment and to governance. There is a lot of room to innovate and make a significant impact in Deployment Orchestration. I’m excited to see what is coming to solve Deployment Orchestration challenges in the new year.

This is my first attempt at a predictions blog. Let me know how I did on Twitter (@csterwa). I’m always looking for feedback.

Have a Happy New Year 2015!

An Experience with Microservices Approach

James Lewis and Martin Fowler published an article on Microservices in March 2014. The tendencies of a microservices-based architecture were well laid out by these highly regarded authors. In this article I would like to share some first-hand learning from implementing software using the Microservices architecture approach.

The Why

To start with, let’s describe why we approached the software in this manner. When our team was forming into a cohesive unit, we were using existing legacy platform tools within a company on a new product in an adjacent market. These platform tools were fairly progressive, yet they were still under heavy development and showed the warts of a monolithic architecture approach accumulated over the past 5-10 years. The platform had tight coupling and circular dependencies, and teams could not work in isolation on cross-cutting aspects of the platform such as the UI controls and client-side data stores. Also, there were performance issues on client and service APIs that were becoming visible with larger customers who had more data to manage. Since we were creating a new product, we soon found that using the same platform tools and APIs was going to slow us down and that we could potentially inhibit other teams working on resolving these issues.

In my previous engagements, the architecture patterns that supported long-term needs were those that allowed for changeability. Changeability tended to go hand in hand with a *nix-like approach of components that do a single thing (Single Responsibility Principle) and have low coupling with adjacent and/or dependent components (check out the SOLID principles, which every developer should learn, for more detailed information). I had success with approaches that supported these two main ideas on many software teams, and as a consultant I witnessed many more architectures that I would also deem successful, even over time. The visibility into the infrastructure and service design at Netflix also influenced just how far we should go to develop software that would evolve naturally with the changes in the business. Thus we embarked on a journey to implement our software in a manner that would allow for flexible deployment of business capabilities in microservices.

The Domain

We were developing software for an adjacent market that we had co-defined with customers through years of experience consulting in the domain and running experiments for problem/solution fit using a Lean Startup approach. The business capability had become fairly coherent from a high-level domain model perspective. We knew the parts and how they would fit together in order to create our first MVP (Minimum Viable Product). The business capability was still large enough that it involved multiple responsible components, each with its own logic and user interactions at the client and API. To avoid over-complicating our development, we decided to create RESTful stateless services based on Dropwizard, an authorization and external API consumer layer, and a client-side UI based on AngularJS. We used MongoDB as our main persistent store due to the nature of the data we were supporting, and PostgreSQL for user permission management.

Even though we had learned enough to focus on delivering the MVP, there was still learning to be had with the customers we were co-creating the software with, who were using it in a beta capacity in real world situations. This meant that we needed to absorb change in all aspects of the product, client-side or in our services. Not only that, we had to deploy those changes quickly to learn whether they actually met our customers’ needs. We had an effective Continuous Delivery (CD) pipeline that allowed all services and the client-side UI to be built, tested from multiple perspectives, and deployed into staging and production environments. This also included a separate pipeline for Chef cookbooks that were used to bootstrap instances on Amazon EC2 from scratch. All of this infrastructure allowed us to deploy changes at any commit to master on any source code repository that was being watched by our CD pipelines.

The Product Owner had a button they could push at any time to deploy what was in staging into production without any scheduled downtime. This was enabled through our rolling deployment approach: take a vertical slice of our environment out of rotation, deploy to it, run smoke test verifications on that slice, put it back into rotation, and then continue to the next slice. None of this necessitated a microservices approach, although it was not much more difficult than other approaches I’ve had first-hand knowledge of, and it provided nice isolation of capabilities within the product.

And Finally, the Learnings…

There were many learnings that we came away with. Some were specific to the context of our company and others, the ones I will share here, were more general in nature. Of course, these are in retrospect and have some (OK, maybe a lot of) opinion baked into them. I hope they are useful to others, whether they are followed directly or just spark conversations.

Aggregate Logging

Effective logging is essential for resolving issues in any software. When you have services and clients running across many instances, aggregating logs becomes even more important. On top of that, if you have production access policies, such as those found in FDA, HIPAA, and PCI environments just to name a few, then development teams are restricted from direct access to the running instances, personally identifiable data, and network traffic. Logs must therefore not only capture logging levels but also follow consistent patterns. Teams should discuss and agree on their logging patterns, including “backstops” for exceptions or unexpected issues that aren’t captured in the implementation code. Pull these logs into a central service such as Logstash or Splunk.
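As a rough illustration of an agreed logging pattern with a backstop, here is a minimal sketch. The field names, the `createLogger` and `withBackstop` helpers, and the one-JSON-object-per-line format are all assumptions made for this example, not something a particular log aggregator requires.

```javascript
// Hypothetical logging pattern: one JSON object per line, always carrying
// the same core fields so an aggregator can filter by service and instance.
const LEVELS = ["debug", "info", "warn", "error"];

function createLogger({ service, instance, write = (line) => console.log(line), now = () => new Date() }) {
  return function log(level, message, fields = {}) {
    if (!LEVELS.includes(level)) {
      throw new Error(`unknown log level: ${level}`);
    }
    // Extra fields are spread first so they cannot overwrite the core fields.
    const entry = {
      ...fields,
      timestamp: now().toISOString(),
      level,
      service,
      instance,
      message,
    };
    write(JSON.stringify(entry));
    return entry;
  };
}

// "Backstop": wrap a handler so that any exception the implementation code
// did not catch is still logged in the agreed pattern, then rethrown.
// This sketch only covers synchronous handlers.
function withBackstop(log, handler) {
  return function (...args) {
    try {
      return handler(...args);
    } catch (err) {
      log("error", "unhandled exception", { errorName: err.name, errorMessage: err.message });
      throw err;
    }
  };
}
```

Because every line carries the same core fields, a team without direct access to production instances can still narrow a problem down to a service and instance from the aggregated logs alone.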

Focus on Bounded Context of Services

Using techniques from Domain-Driven Design (DDD) with special attention paid to domain models, ubiquitous language, and bounded context will help in defining where capabilities are to be separated into their own services. The time put in by the whole team in defining and understanding the language and bounded context of each capability in the domain enabled client-side code to easily separate access to each service without coupling calls across multiple services. We could have a pair of developers working on one view and service and another pair working on a separate view and service typically without affecting each other’s work.

Lookup Configurations from Deployment Environment

When deploying into multiple environments, such as development, staging, and production, it is important to allow per-environment configuration. These environment configurations could include service endpoints, database access, logging, access tokens, and more. There are many techniques for making configurations available for lookup by running processes. Some examples are shell environment variables, a URL that returns an XML or JSON response, and coordination services such as ZooKeeper and etcd. This supports operational configuration of services as well as access policies for each environment’s authorization tokens.
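Taking the shell environment variable option as an example, a lookup might be sketched as follows. The variable names (`APP_ENV`, `ITEMS_SERVICE_URL`, `DATABASE_URL`, `LOG_LEVEL`) and the `loadConfig` helper are hypothetical and only illustrate the idea of failing fast when a required setting is missing.

```javascript
// Reads configuration from the process environment (Node.js) by default.
// A plain object can be passed instead, which keeps the function testable.
function loadConfig(env = process.env) {
  function required(name) {
    const value = env[name];
    if (value === undefined || value === "") {
      // Fail at startup rather than at the first request that needs the value.
      throw new Error(`missing required configuration: ${name}`);
    }
    return value;
  }

  function optional(name, fallback) {
    const value = env[name];
    return value === undefined || value === "" ? fallback : value;
  }

  // Frozen so running code cannot change configuration after startup.
  return Object.freeze({
    environment: optional("APP_ENV", "development"),
    itemsServiceUrl: required("ITEMS_SERVICE_URL"),
    databaseUrl: required("DATABASE_URL"),
    logLevel: optional("LOG_LEVEL", "info"),
  });
}
```

The same build artifact can then move from staging to production unchanged, with only the environment around it differing.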

Cohabitate Highly Cohesive Code

For some reason, putting code written in multiple languages into the same source code repository feels a bit dirty; at least it did for me. At the same time, many different aspects of a service within its bounded context may necessitate multiple tools. For example, we may want to provide shell scripts to deploy our code alongside the service’s business capability focused code. Other things that should be considered part of the service’s source code repository are Chef or Ansible instance configuration code, PaaS (Platform as a Service) configuration files, build scripts, and instance launch automation scripts. Probably the most controversial suggestion I will make is to also serve client-side code that specifically interacts with the service’s endpoints. Since serving client-side code from the service itself may be controversial, here is an example:

Given the following service API endpoints:

GET /api/items

PUT /api/items

POST /api/items/{id}

We might serve a JavaScript API client that has functions for interacting with each endpoint:

{
  getItems: function() { … },
  addItem: function(item) { … },
  updateItem: function(item) { … }
}

This allows the client-side code that interacts with the service to be tested and updated at the same time as the service itself. If the client-side code is put into a separate source code repository and is built and deployed separately, situations arise where client and service changes depend on each other but move through the deployment process separately. The code must then be deployed in a specific manner that is not in alignment with the Continuous Delivery pipeline.
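To make the example a little more concrete, here is one way such a served client could be fleshed out. This is a sketch under several assumptions: the pairing of each function with an endpoint simply follows the order of the endpoint list above, the item is assumed to carry an `id` property, and `createItemsClient` and `fetchTransport` are hypothetical names. The transport is injected so the client can be tested in the service's own build without a network.

```javascript
// transport(method, path, body) is a hypothetical injected function that
// performs the HTTP call and returns a promise of the parsed response.
function createItemsClient(transport) {
  return {
    getItems: function () {
      return transport("GET", "/api/items");
    },
    addItem: function (item) {
      return transport("PUT", "/api/items", item);
    },
    updateItem: function (item) {
      // Assumes the item carries its own id; the id is escaped for use in a path.
      return transport("POST", "/api/items/" + encodeURIComponent(item.id), item);
    },
  };
}

// One possible transport built on fetch, which is available in browsers and
// recent Node.js versions. Assumes the service answers with JSON.
function fetchTransport(baseUrl) {
  return async function (method, path, body) {
    const response = await fetch(baseUrl + path, {
      method,
      headers: { "Content-Type": "application/json" },
      body: body === undefined ? undefined : JSON.stringify(body),
    });
    if (!response.ok) {
      throw new Error(`${method} ${path} failed with status ${response.status}`);
    }
    return response.json();
  };
}
```

A page would use it as `createItemsClient(fetchTransport(""))`, while the service's test suite can pass a fake transport and assert on the calls it receives.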

Monitoring and Alerting are Essential

The flexibility of horizontal scaling of stateless services and federated data comes at a cost. It is at times difficult to know the path that is being taken through the application and which instances are involved. Find ways to identify specific service instances in logs, on clients, and in monitoring tools such as New Relic. This will reduce the Time to Resolution in many cases and keep you sane when tracking down causes of issues.

On top of monitoring, finding ways to alert when issues need to be looked into, using tools such as PagerDuty, will help the team track down issues quickly after their introduction into the environment. Virtual machines get rebooted, instances fail, networks get blocked, syntax errors in configurations cause outages, and any number of other issues can cause problems in an environment. I recommend becoming familiar with the Fallacies of Distributed Computing to help with thinking about the ways your software can fail in any distributed system. Even browser or mobile clients connecting with services form a distributed system and can fall victim to these fallacies.
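One simple way to decide when an issue "needs to be looked into" is to alert once failures cross a threshold within a time window. The sketch below assumes that rule; `createErrorAlerter` and its `notify` callback are hypothetical, and `notify` stands in for whatever call your alerting tool actually provides.

```javascript
// Counts failures inside a sliding time window and calls notify once when
// the count reaches the threshold. It re-arms after the count drops below
// the threshold again, so a sustained outage does not page repeatedly.
function createErrorAlerter({ threshold, windowMs, notify, now = () => Date.now() }) {
  let failures = []; // timestamps of failures still inside the window
  let alerted = false;

  return function recordFailure(detail) {
    const current = now();
    failures.push(current);
    failures = failures.filter((timestamp) => current - timestamp < windowMs);

    if (failures.length >= threshold) {
      if (!alerted) {
        alerted = true;
        notify({ count: failures.length, windowMs, detail });
      }
    } else {
      alerted = false;
    }
    return failures.length;
  };
}
```

Passing the failing instance identifier as `detail` ties the alert back to the instance identification mentioned above.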

Isolate Deployment Slices for Verification

To keep your services highly available it is important to have a deployment process that allows changes to be introduced without scheduling maintenance downtime. This matters most when teams are deploying frequently, maybe multiple times per day, into an environment. Finding mechanisms to continue serving consumers of the service while deploying new versions makes deploying software a business decision rather than a technical hurdle to leap over. Our process was something like the following:

  • Have at least two isolated slices deployed in an environment
  • Take one isolated slice out of rotation and direct all consumers to other slice(s)
  • Deploy to the out of rotation isolated slice
  • Run smoke tests that are configured to test the out of rotation isolated slice
  • If smoke tests pass then bring the isolated slice back into rotation
  • Rinse and repeat with other slice(s)
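The steps above can be sketched as a small loop. This is not our actual tooling: `rollingDeploy`, the `router` object, and the `deploy` and `smokeTest` hooks are hypothetical stand-ins for whatever load balancer and deployment automation an environment provides. The sketch also assumes that a slice failing its smoke tests is left out of rotation and the rollout stops.

```javascript
// slices: identifiers of the isolated slices in the environment.
// hooks.router, hooks.deploy, and hooks.smokeTest are hypothetical async hooks.
async function rollingDeploy(slices, version, { router, deploy, smokeTest }) {
  if (slices.length < 2) {
    throw new Error("need at least two isolated slices to keep serving consumers");
  }

  const updated = [];
  for (const slice of slices) {
    // Consumers are directed to the other slice(s) while this one is updated.
    await router.removeFromRotation(slice);
    await deploy(slice, version);

    const passed = await smokeTest(slice);
    if (!passed) {
      // Assumption: keep the failing slice out of rotation and stop here,
      // so the remaining slices keep serving the previous version.
      return { ok: false, failedSlice: slice, updated };
    }

    await router.addToRotation(slice);
    updated.push(slice);
  }
  return { ok: true, failedSlice: null, updated };
}
```

Because each hook is passed in, the same loop can be exercised in tests with in-memory fakes before it is pointed at a real environment.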

This overly simplified high level process overview has many complications that need to be resolved based on the configuration of the environment being deployed to. PaaS and other approaches to automate service deployment, networking, and configuration management can go a long way to help make this process less impactful to a team’s feature delivery by taking care of the complications through tools and APIs.

Conclusion

The Microservices architecture approach can provide effective boundaries between capabilities in a system and provide tremendous flexibility not easily attained through other more monolithic approaches. There is a cost to the microservices architecture approach in terms of a more complicated environment setup and deployment. These costs can be overcome through the learnings above and by applying techniques shared online by many folks using this approach. If your team is creating new business capabilities, I highly recommend taking a deeper look into how a microservices approach could help provide the essential flexibility and scalability that modern software solutions need.