A lot has changed in software infrastructure in the 13 years since I started working with it professionally. I was hired right out of college as a junior developer by a company that spent most of its history online. As the web business flourished, the power of the internet as a sales platform became apparent, and so did the need to continually adjust the processing power required to support it.
The commerce site started off as a simple static application that augmented a catalog mailer. Orders were collected via forms submitted from the website or through (snail) mail-in order forms and were managed by hand on the fulfillment side. The web server was a personally owned, co-located server hosted at a local ISP. Management of the server was not an easy task. Not only did the application software need to be updated periodically, but the OS, drivers, kernel and other supporting libraries also needed to be routinely patched to deal with security issues and to support new functionality. If for some reason you lost connectivity to the server, either a dial-in backdoor or an on-site visit was required. Hardware issues and upgrades had to be addressed on-site as well, and hardware glitches proved expensive to track down and repair.
As the company grew, so did the infrastructure and technologies that were needed. An array of servers was purchased and located on-premises, requiring additional operations teams to manage the hardware, not to mention the cost of electricity and high-speed internet access. A new dynamic software framework was developed on the LAMP stack, giving us more control of the site content. This, however, increased the demand for hardware, requiring the deployment of individually scalable, load-balanced applications.
Eventually, traffic grew beyond what the company could support on-premises. A decision was made to move to a third-party managed datacenter that used hardware virtualization. Essentially this meant sharing resources with other tenants, which is a good way to keep costs down by spreading the upkeep of system hardware and software around. All management of the underlying hardware and software was handled by the provider. Provisioning was easy: simply request specific hardware criteria and wait as the servers were deployed as virtual machines. It was easy for the provider to “spin up” a server with a minimal amount of effort and time.
Soon after this infrastructure change was made, I decided to leave the company for a better opportunity. At the time, there was still a great deal of manual environment curation, very little automation, and a small army of DevOps team members who maintained the state of the environment. As changes were needed, these team members would act on requests to expand or reduce the footprint of the system to support those needs.
The introduction of cloud-based IaaS providers extends the opportunity to reduce the infrastructure footprint of any organization. At Aaxis Digital, a decision was made to adopt a cloud-based strategy at an early stage. We have worked with many of the top-tier cloud providers, such as Amazon Web Services (AWS), Microsoft Azure, Alibaba Cloud and Google Cloud Platform (GCP), to name a few. Although Aaxis Digital is platform agnostic and selects a cloud provider based on specific client needs, the examples in this document reference Google Cloud Platform features. This is primarily due to my personal experience with the environment and is not an endorsement by any means.
If you have not yet been introduced to cloud computing or cloud-based providers (despite the ads pummeling all facets of media), we can quickly define why they were conceived and then point out some specific benefits they offer right out of the box. As mentioned above, I will use Google Cloud Platform as an example. You may prefer another cloud provider; most of these service offerings are available from all of them, with slight differences.
Essentially, a cloud provider is the next step in the historical evolution of infrastructure described at the beginning of this article. They address some of the primary limitations of the previous solutions used by organizations, including the one I described in the timeline above.
The five primary high-level cloud platform benefits are the following:
- On Demand, Self Service Resources – All of the resources offered by a cloud provider are directly available for a customer to manage and can be created, edited or destroyed with no “human” interaction by the provider. The resources can be managed through several different interfaces and even automated through tools like Terraform (infrastructure as code) and Helm (Kubernetes package management as code).
- Robust/Broad Network Access – Regardless of where in the world a request originates, there should be a localized edge device to allow ingress into your cloud VPC (Virtual Private Cloud) with minimal latency. Resources can also be placed close to the majority of your traffic, or spread across multiple regions/zones if your customer base is geographically broad. This can drastically improve performance.
- Resource Pooling – Since all offerings in the cloud are based on virtualized hardware, resources are shared by all users of the cloud infrastructure. As with the virtualized datacenter I mentioned earlier, the resource costs are shared, but on a much larger scale. This allows the cloud provider to be more agile in keeping the platform up to date from a hardware and software standpoint.
- Rapid Elasticity – With on-demand computing, instance groups, Kubernetes and cloud-based routing, it is easy to set up a platform that can scale either manually or based on events that you define in your workloads. For example, you can create monitoring alerts to notify you that a specific application is not able to meet the needs of its consumers. With this information, you can choose to increase your resources with simple adjustments to the overall application config. You can also have the cloud provider monitor your application and make dynamic adjustments based on rules you create from system or event-based notifications. These dynamic changes can be constrained by the budgetary and explicit sizing limits that you specify.
- Measured Service – You only pay for the resources you consume and only when they are active/leveraged. This comes into play with not only the IaaS offerings but also the SaaS offerings. Often, the more you use, the more you save in the pricing models.
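The Rapid Elasticity item above describes rule-based scaling that is constrained by explicit sizing limits. As a rough illustration only, the sketch below applies a simple proportional rule: scale the instance count by the ratio of observed to target CPU utilization, then clamp the result. The function name, parameters and thresholds are all hypothetical and are not taken from any provider's API.

```python
def desired_instances(current: int, observed_cpu_pct: int, target_cpu_pct: int,
                      min_instances: int, max_instances: int) -> int:
    """Return how many instances to run, clamped to explicit sizing limits.

    All names and thresholds are illustrative assumptions, not a provider API.
    """
    if current < 1 or target_cpu_pct < 1:
        raise ValueError("current and target_cpu_pct must be positive")
    # Proportional rule: scale the group by observed / target utilization,
    # rounding up so the group is not left under-provisioned.
    load = current * observed_cpu_pct
    wanted = (load + target_cpu_pct - 1) // target_cpu_pct  # integer ceiling
    # The min/max bounds stand in for budgetary and sizing constraints.
    return max(min_instances, min(max_instances, wanted))


if __name__ == "__main__":
    # Four instances at 90% CPU against a 60% target.
    print(desired_instances(4, 90, 60, min_instances=2, max_instances=10))
```

Integer percentages are used here so that the rounding is exact. A managed autoscaler would also smooth measurements over time and apply cool-down periods, which this sketch leaves out.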
Google Cloud Platform (GCP) – An Example
As an example, let’s consider the specific features that are offered by Google Cloud Platform (GCP). Google provides four main offerings: Compute, Storage, Big Data, and Machine Learning.
Cloud Compute
Compute provides the Windows or Linux virtual machines that your software can run on. GCP offers several pre-baked OS images for you to choose from, or you can create your own images for reuse. As with most cloud providers, you can specify the number of CPUs and the amount of memory your instance requires, and even the architecture of the chipset. There are also optional pre-defined instance configurations to select from based on your application’s profile (heavy memory, heavy IO or heavy CPU, for example).
You have full control over the disk type, which can be persistent SSD/HDD or simple ephemeral local SSD. You can also configure the instances with a startup script that preloads the applications/libraries required to support your running software. Templatization of your instance configuration eases deployment of identical server profiles. It also enables the use of instance groups, which deploy and manage one or more identical servers as a group and are critical for failover and high availability needs.
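To make the template and instance group idea more concrete, here is a minimal sketch in plain Python. The `InstanceTemplate` and `Instance` classes and the `build_instance_group` helper are hypothetical stand-ins invented for illustration; they model the concept only and do not call any GCP API. The machine and disk type strings are example values.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class InstanceTemplate:
    """Hypothetical, simplified stand-in for a cloud instance template."""
    machine_type: str    # a predefined CPU/memory shape
    disk_type: str       # persistent SSD/HDD or ephemeral local SSD
    disk_size_gb: int
    startup_script: str  # runs at boot to preload applications/libraries


@dataclass(frozen=True)
class Instance:
    name: str
    template: InstanceTemplate


def build_instance_group(prefix: str, template: InstanceTemplate,
                         size: int) -> List[Instance]:
    """Stamp out `size` identically configured instances from one template."""
    if size < 1:
        raise ValueError("an instance group needs at least one instance")
    return [Instance(name=f"{prefix}-{i:03d}", template=template)
            for i in range(size)]


if __name__ == "__main__":
    web = InstanceTemplate(
        machine_type="e2-medium",  # example value only
        disk_type="pd-ssd",        # example value only
        disk_size_gb=50,
        startup_script="#!/bin/bash\napt-get install -y nginx\n",
    )
    for instance in build_instance_group("web", web, 3):
        print(instance.name, instance.template.machine_type)
```

Because every member of the group shares one immutable template, replacing a failed server or growing the group is just a matter of stamping out another copy.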
If you would rather work with containers and possibly container orchestration, GCP offers a few options to choose from.
- Google Cloud Functions provides a node-based cluster to which you can deploy small, single-purpose, quick-running applications that are “triggered” by events. This is a good fit for stateless services with minimal overhead and quick runtimes. The cost is low because you are charged per function invocation, not for up-time, and the functions are fully autoscaling.
- Google App Engine is a serverless container framework that allows for several “auto-detected” programming languages. You simply write your code and deploy it. There is an option to persist data as well which makes it more flexible than cloud functions.
- Google Kubernetes Engine (GKE) is the most flexible, yet most complex, container deployment/orchestration platform offered. The clusters are built on top of Compute VMs and share the same pricing strategy. Workloads can be deployed directly to GKE and monitored with various tools provided by Google. You can also leverage Helm to manage workloads, which enforces configuration in a declarative manner.
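As a rough idea of what a small, event-triggered, stateless function looks like, here is a sketch of a handler that reacts to a file upload event. The `(event, context)` shape is modeled loosely on the event-driven function style. The handler name and the `bucket`, `name` and `size` payload fields are assumptions for illustration and should be checked against the provider documentation before use.

```python
from typing import Any, Dict, Optional


def on_object_uploaded(event: Dict[str, Any],
                       context: Optional[Any] = None) -> str:
    """Stateless handler: everything it needs arrives in the event payload.

    The payload field names used here are illustrative assumptions.
    """
    bucket = event.get("bucket", "<unknown-bucket>")
    name = event.get("name", "<unknown-object>")
    size = int(event.get("size", 0))
    message = f"received gs://{bucket}/{name} ({size} bytes)"
    print(message)  # stdout is the simplest form of logging here
    return message


if __name__ == "__main__":
    # Local invocation with a hand-built event, no cloud account required.
    on_object_uploaded({"bucket": "media", "name": "cat.png", "size": "2048"})
```

The function keeps no state between invocations, which is what allows the platform to run as many or as few copies as the event volume requires.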
Cloud Storage
Storage is a suite of GCP SaaS solutions which can provide your applications with several distinctive types of persistence. All of the storage options are managed by GCP and scalable to your application needs automatically. Security is baked in with encryption end to end and full identity management controls for heightened security needs. A few examples are:
- Google Cloud Storage (GCS) is a RESTful object storage mechanism that offers location-based storage, storage classes to control availability and pricing (Standard vs. Archive, for instance) and object lifecycle management. It is perfect for persisting media, documents and other large object types.
- Google Cloud SQL/Spanner are the relational database options provided. Cloud SQL gives you the choice of MySQL or PostgreSQL as a basis. Spanner is a highly available, horizontally scalable, strongly consistent, proprietary SQL database. It’s a great choice if you are expecting a massive dataset with high I/O.
- BigQuery is a data warehouse appliance that can be used for reporting or analytics.
- Google Cloud Firestore/Bigtable are the NoSQL offerings. Firestore is designed more for smaller structured objects such as session, profile or state data.
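The object lifecycle management mentioned for Cloud Storage is typically expressed as a small declarative policy. The sketch below builds one as a Python dictionary and prints it as JSON. The structure is modeled on the JSON lifecycle configuration format as I understand it, and the helper name and day thresholds are hypothetical, so verify the exact schema against the official documentation before applying it to a bucket.

```python
import json
from typing import Any, Dict, List


def archive_then_delete(archive_after_days: int,
                        delete_after_days: int) -> Dict[str, List[Dict[str, Any]]]:
    """Build a policy that archives objects first and deletes them later.

    Hypothetical helper; the rule layout is an assumption to be verified.
    """
    if not 0 <= archive_after_days < delete_after_days:
        raise ValueError("objects must be archived before they are deleted")
    return {
        "rule": [
            {   # Move older objects to a cheaper, colder storage class.
                "action": {"type": "SetStorageClass", "storageClass": "ARCHIVE"},
                "condition": {"age": archive_after_days},
            },
            {   # Remove objects entirely once they are no longer needed.
                "action": {"type": "Delete"},
                "condition": {"age": delete_after_days},
            },
        ]
    }


if __name__ == "__main__":
    print(json.dumps(archive_then_delete(365, 2555), indent=2))
```

Keeping the policy as data rather than as a scheduled cleanup job means the storage service enforces it for you, which fits the managed nature of these offerings.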
Big Data
Big Data can be used to capture, process and analyze large amounts of system-generated data for reporting, analysis, personalization or just about anything you want. There are several tools that can be used with big data to transport and enrich your dataset.
- Cloud Dataproc can be used to run calculations on large sets of data across a cluster of workers. It is a good fit for map and reduce activities.
- Cloud Dataflow is a message stream processing mechanism that can handle an unpredictable load of data. The primary benefit of this tool is to de-dup, order and augment data flowing through the system before it’s eventually persisted.
- Cloud PubSub is an “at least once” asynchronous messaging system. It works well as a transport for various streams of string data such as events and logging.
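“At least once” delivery means a consumer may occasionally see the same message twice, which is one reason de-duplication comes up in stream processing. The sketch below shows one common way to cope: an idempotent consumer that remembers the message IDs it has already handled. The class name is hypothetical, and the in-memory set is a simplifying assumption; a real system would need durable, shared storage for the seen IDs.

```python
from typing import Callable, List, Set


class IdempotentConsumer:
    """Process each message ID once, even if it is delivered repeatedly."""

    def __init__(self, handler: Callable[[str], None]) -> None:
        self._handler = handler
        self._seen: Set[str] = set()  # in-memory only; an assumption

    def receive(self, message_id: str, data: str) -> bool:
        """Return True if processed, False for a redelivered duplicate."""
        if message_id in self._seen:
            return False
        self._handler(data)
        # Record the ID only after the handler succeeds, so a failed
        # message can be redelivered and retried.
        self._seen.add(message_id)
        return True


if __name__ == "__main__":
    processed: List[str] = []
    consumer = IdempotentConsumer(processed.append)
    # The transport redelivers "m1"; the consumer ignores the repeat.
    for message_id, data in [("m1", "login"), ("m2", "logout"), ("m1", "login")]:
        consumer.receive(message_id, data)
    print(processed)
```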
Machine Learning
Machine Learning encapsulates the functionality offered by Google that leverages the massive amount of data that passes through its systems every day. This includes natural language processing, language translation, and image and video context analysis, which are critical aspects of many modern websites.
Cloud platforms make many additional offerings available to you: caching, monitoring and access control down to the object level, just to name a few. On top of that, if there is something you need that is not directly available in the cloud offering, you can always browse robust Marketplaces where pre-built images are available on demand and generally for free (resource costs do apply).
Concluding Remarks
Regardless of your organization’s computing needs, adopting a cloud platform will provide you with a flexible infrastructure, loaded with a wide range of functional offerings, at a low shared cost. The burden of maintaining an IT team to manage physical hardware becomes a thing of the past, and concerns like security are baked right into the framework. The time to move to the cloud is now. Please feel free to comment or drop me a note at AAXIS.