“Cloud native is structuring teams, culture, and technology to utilize automation and architectures to manage complexity and unlock velocity.”
Joe Beda, Co-Founder, Kubernetes and Principal Engineer, VMware
Design principles and architectural approaches like API-first design, twelve-factor (stateless) applications, and domain-driven design can be used to make microservices, modular monoliths, and even (gasp!) monoliths themselves into cloud-native applications. But it’s more than application architecture and technology like IaaS, PaaS, and SaaS. A cloud-native application relies on human practices and methodologies, as well as organizational structure.
However, the design goal of cloud-native applications yields stateless applications, which are only one half of the IT puzzle; applications without data would be useless. Mission-critical, operational data is only in the early stages of its transition from monolithic to distributed environments. A cloud-native, serverless application that can autoscale with user load will still bottleneck at some point if its database cannot do the same. Serverless, autoscaling data changes the cloud-native landscape for applications and functions alike.
Cloud-native applications defined
Cloud-native applications are a collection of small scope, independent, and loosely coupled services with strong API contracts, built on a foundation of virtualized, commodity servers.
They are specifically (re)designed to fully exploit containerized deployment on distributed IaaS, PaaS, and FaaS environments, and are supported by Agile, DevOps, and CI/CD processes and methodologies.
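As a sketch of what a strong API contract looks like in practice, the snippet below defines a small interface and writes service logic against it, so the implementation can be (re)implemented without touching callers. All names here are hypothetical, chosen purely for illustration:

```python
from typing import Protocol


class ProfileStore(Protocol):
    """Hypothetical API contract: callers depend only on this interface."""

    def get(self, user_id: str) -> dict: ...


class InMemoryProfileStore:
    """One implementation: fine for local tests, not cloud native on its own."""

    def __init__(self) -> None:
        self._data = {"u1": {"name": "Ada"}}

    def get(self, user_id: str) -> dict:
        return self._data[user_id]


def render_greeting(store: ProfileStore, user_id: str) -> str:
    # Service logic is written against the contract, so the store can be
    # swapped for a networked, replicated implementation without changes here.
    return f"Hello, {store.get(user_id)['name']}!"
```

The design point is that the contract, not the implementation, is the stable surface other services build against.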
As stated in the cloud-maturity model, monoliths can run on the cloud, but generally are not considered cloud-native applications as they have more internal interdependencies, resulting in:
- Slow deployment of new features, and reduced change velocity
- Inability to use multiple versions of the same foundation libraries within the app
- Slower regression testing / testing cycles due to a larger code surface area
- No componentization for independent deployment or scaling
For this reason, today, many early-stage (monolithic) applications are being designed as modular monoliths. This additional design effort is merited when there is a reasonable chance those applications will have large code bases, with multiple developers participating concurrently.
So, what specifically qualifies an application or service to be considered “cloud native”? They generally use most of Heroku’s 12 factors and/or new factors that have emerged in the industry, resulting in qualities like:
- Stateless (state is explicitly managed elsewhere)
- Uses lightweight virtualization (Linux containers)
- Service oriented, or microservice architecture, or a cloud function
- Focused on API contracts, allowing flexible (re)implementation
- Deployed to software-defined, elastic virtual infrastructure (IaaS, PaaS, FaaS)
- Highly available, fault tolerant, and self-healing, at least at the IaaS level if not higher
- Managed via Agile, DevOps, and CI/CD processes
- Often dynamically configured and discovered at runtime
- Observable: participates or emits centralized log/metric streaming, APM, tracing
- Run on immutable infrastructure that can be started, stopped, scaled, etc. at any time
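Two of these qualities, runtime configuration and statelessness, can be illustrated with a short sketch. The variable names and defaults below are illustrative assumptions, not a prescribed implementation:

```python
import os


def load_config(env=os.environ) -> dict:
    # Twelve-factor style: configuration comes from the environment at
    # runtime, not from values baked in at build time.
    return {
        "db_url": env.get("DB_URL", "postgres://localhost/app"),  # hypothetical key
        "replicas": int(env.get("REPLICAS", "1")),
    }


def handle_request(session_store: dict, session_id: str, item: str) -> list:
    # Stateless handler: all state lives in an external session store that is
    # passed in, so any replica of the service can serve any request.
    cart = session_store.setdefault(session_id, [])
    cart.append(item)
    return cart
```

In a real deployment the session store would be a network-aware cache or database rather than a local dict, but the shape is the same: the process itself holds nothing between requests.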
However, these 12 factors are a bit dated today. Some modern databases, datastores, and caching technology have emerged that are cloud native. We can also define cloud-native applications or services in terms of what they are not, including:
- App state or data in-process that is not cluster/network aware or sharded/replicated
- Long running, non-disposable, virtualized instances, or servers directly tied to hardware
- Monolithic architecture, not decomposed into separate units of scale or manageability
- Not composed of contract-first APIs; components are tightly coupled
- Tightly coupled to an individual operating system, not designed for virtualization
- Fault tolerant only at the OS level, if at all, with no app-level fault tolerance
- Waterfall, ITSM/ITIL and manual release management
- Config and routing/discovery at compile time, declared statically
- Non-aggregated logs/metric streams requiring SSHing into an individual VM to access
- Dependent on a specific server’s OS/configuration, whether virtualized or not
Clarifying cloud-native vs. cloud-based application development
Cloud native is the ultimate expression and union of organizational structure, processes, designs, and techniques used specifically to exploit distributed, service-oriented cloud architecture. There are waypoints on an application’s construction (or migration) journey where incremental value can be achieved—they are the stages of a cloud-native maturity model.
A cloud-enabled application is best summarized as a traditionally built monolithic application, likely with no internal domain-driven design, that has been minimally (re)designed to exploit the cloud. A good way to illustrate its technical differences from cloud-native applications is to examine what would need to be done to redesign such a traditional monolith: an application not intended for deployment on a virtual machine or container.
Industries that rely on cloud-native applications
Media, telco, finance, high tech, government, healthcare, retail…the list goes on. You’ll find cloud-native applications anywhere medium-to-large scale businesses are going through digital transformation. Digital transformation and differentiation is not achieved by using the same software as others. It’s achieved by creating novel software that reflects your organization’s unique competence and value. This endeavor often involves large, custom codebases worked on concurrently by many people.
Of course, not every system or application an organization develops merits the level of investment to make it fully cloud native. Yet, it certainly has become popular for systems of customer engagement, as well as serving other national or global-scale business operations. Let’s look at a couple of examples.
Finding financial services differentiation with cloud technologies
Financial services and banking products are straightforward for a competitor to replicate. After all, the terms and rates on savings accounts, credit cards and mortgage loans can simply be matched. To stand out, smart finserv organizations are doubling down on providing the best experience possible for their customers. But, to do that, they need an infrastructure that can handle an absurdly high volume of data, consisting of all data types, and that can process it all blazingly fast. To get there, those leading the way are turning to cloud-based technologies.
Macquarie Group, a leading Australian financial services organization, is doing just that. They began using DataStax Enterprise (DSE) in 2017 as a central part of their journey to a cloud-based infrastructure, with the goal of providing the best customer experience a bank can create.
To support that, the focus has been on creating fast, unique, personalized products and experiences. Moving to a cloud-based data infrastructure enables Macquarie to understand the behavior of their customers and deliver personalized services to them in near real time. They are attempting to speed up every customer interaction, including traditionally slow processes like credit collection and onboarding. Their new environment also allows them to take advantage of a high volume of data in motion or “fast data”—information flowing between applications and IT systems that requires real-time engagement. According to Macquarie’s Chief Digital Officer, Luis Uguina, the company has, “probably an average of 12 million to 15 million data points coming through the system in real time.” Having an infrastructure in place to handle that heavy load seamlessly is the only path to delighting customers.
Another way to provide great customer experiences? Simply focus on it. Macquarie uses managed services for their cloud-based infrastructure, so they can direct their time, energy, and banking expertise toward adding value for their customers. As Uguina said, “We are a bank. Our mission is not to be updating databases, operating systems, playing with networks, or putting more and more metal every single day in the data center. Our mission as a bank is to deliver the most incredible customer experiences and the best financial products.” He added, “That’s why for us, not only the cloud, but fully managed services on the cloud, are the future for us as far as technology.”
Netflix’s journey to cloud native
In his Medium article, Cao Duc Nguyen examines Netflix’s eight-year journey to evolve their IT systems. As he explains, the infrastructure changes at Netflix began in 2008 after a service outage in its own data centers shut down all DVD renting services for three days. After realizing they needed a more reliable infrastructure with no single point of failure, Netflix took action. Two important decisions were made: using public cloud and microservices architecture.
Netflix was a pioneer of microservices architecture, which targets the problems of monolithic software design. Breaking big programs into smaller software components encourages separation of concerns and modularity, with each service encapsulating its own data. Microservices also leverage cloud-scaling automation at the virtual machine level and can split workloads, matching each workload to the right virtual hardware. Netflix engineers were able to change any one service much more quickly, and together with CI/CD deployment automation, this led to faster deployments. Netflix was also able to track service performance and quickly isolate issues between different services.
How companies benefit from building cloud-native applications
Developers are not the only ones who benefit from cloud-native applications. It's important to remember that the traditional notion of the "development team" has evolved. Developers are now often responsible for the operation of everything inside the container and higher up in the application layer.
So why all the fuss? What makes all this worth it for organizations? Key benefits revolve around achieving competitive advantage, quick national- or global-level rollouts, improved customer (self) service and satisfaction, lower CapEx, reduced risk, the business agility to react to changing market conditions, and better alignment of IT with business needs and timelines.
In general, changes like these have beneficial impact at multiple levels:
- Shifting CapEx to OpEx for both hardware and software
- Better utilized and lower-cost hardware resources
- Consumption or usage-based pricing models versus traditional licensing
- Unlocked scalability and national/global deployment at any level
- More cost-efficient scaling, as the unit of scale isn’t the entire application
- Lower operating costs, since virtual hardware can be sized to individual system components instead of the entire application workload
- Observable applications, high availability, multi-availability-zone deployments, and automated recovery at the infrastructure (and often application) level, improving system uptime, issue-resolution time, and performance with less manual intervention
- Improved developer and ops team productivity, particularly with large-scale teams and/or codebases
- Ability to track changing business requirements more tightly by developing and delivering in short sprints
- Improved quality in the software delivered, from automated testing and release management process
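The cost-efficiency of scaling a single unit rather than the whole application can be made concrete with simple arithmetic. The dollar figures and service names below are made up purely for illustration:

```python
# Illustrative (made-up) numbers: a monolith must scale as one unit,
# while a decomposed application scales only the hot service.
MONOLITH_INSTANCE_COST = 10.0  # $/hour for one full-application instance
SERVICE_COSTS = {"catalog": 2.0, "checkout": 3.0, "search": 5.0}  # $/hour each


def monolith_cost(replicas: int) -> float:
    # Every replica carries the whole application, hot paths and cold alike.
    return MONOLITH_INSTANCE_COST * replicas


def microservice_cost(replicas: dict) -> float:
    # Each service scales independently to match its own load.
    return sum(SERVICE_COSTS[name] * n for name, n in replicas.items())
```

Under these assumed numbers, quadrupling capacity for a search-heavy load costs $40/hour as a monolith (four full replicas) but only $25/hour when just the search service is scaled to four replicas, since the unit of scale is smaller than the whole application.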
Key factors to consider when building your app to run in the cloud
Gaining the full benefits of cloud-native applications depends on making the right personnel changes when reorganizing. And it also requires the right changes to methodologies and practices around development, operations, information security, and more.
Cloud-native vs. on-prem vs. hybrid-cloud applications
With so much at stake, many organizations decide to turn to a cloud service provider to take on some or all aspects of their environment. When deciding between managed and self-managed models, knowing what your organization can, should, or wants to tackle itself versus outsourcing is essential. Options range from an on-premises infrastructure, where you use your own server and manage everything yourself, to using a software-as-a-service (SaaS) vendor that handles all aspects of your setup and hosts it in the cloud.
Most successful companies tend to focus above the value line on differentiating functionality. Of course, there are certainly reasons operating your own data center, platforms, and applications could make sense, including:
- Data sovereignty or privacy
- Regulation, legislation
- Deep integration with existing on-premises, self-managed systems
- Extending value of existing investments
- Leverage of existing team expertise and/or personnel
On-premises private cloud
Generally, this is when you have your own facilities and data centers, and your employees operate them. It's the least recommended model because it offers less benefit in speed to market, ease of operations, and ready-to-go compliance and security. You may very well employ an IaaS solution like VMware or OpenStack to reclaim some of those benefits, but at the end of the day, it’s your CapEx and employees at the helm.
Hosted private cloud
This refers to when the IaaS layer comes from the cloud service provider, but management of everything else built on top of it (middleware up to application) is managed by your employees. At the networking layer, for example, you will hear terms like virtual private cloud or private link, indicating the infrastructure you've rented is operated for you, but administered by your teams and is less automated.
Public cloud
This is the popular, central-cloud concept, where customers choose what level of infrastructure to have managed for them—IaaS, PaaS, FaaS, or SaaS. Services are provisioned and operated by an external entity and are available on-demand in a “pay for what you use” model.
Hybrid cloud
This is a mix of all three models: self-managed on-premises private cloud, hosted private cloud, and public cloud. Interestingly, for this model to be effective, it shouldn’t matter how or where computing is done. This enables concepts like cloud bursting, where a private cloud might automatically or seamlessly receive additional resources from a public cloud source in the event of a resource shortfall (or vice-versa).
The positive ripple effect of building cloud-native applications
Shifting from ITSM/ITIL to DevOps eliminates manual ticket-based processes, increasing automation. This same shift also lowers mean time to recovery or resolution (MTTR), as developers are incentivized to create observable systems and be responsible for their application uptime. This allows operations staff to focus on infrastructure uptime. Shifting from waterfall to Agile development methodologies shortens feedback loops to the business. Adopting continuous integration and continuous delivery/deployment forces explicit definition of development team workflows. Frequent releases encourage and facilitate automated release management, reducing risk. When coupled with automated testing, or test-driven development, bugs are surfaced earlier and software quality is improved. Yet these elements have to be combined correctly to achieve organizational goals such as increasing responsiveness to changing business requirements, accelerating delivery, improving quality, and reducing risk.