Welcome!

@DevOpsSummit Authors: Liz McMillan, Elizabeth White, Pat Romanski, Jyoti Bansal, Yeshim Deniz

Related Topics: @DevOpsSummit, Microservices Expo, Containers Expo Blog

@DevOpsSummit: Article

Sharding for Scale | @DevOpsSummit #BigData #DevOps #Microservices

Where you decide to shard for scalability impacts the complexity of the entire architecture

Sharding for Scale: In the App or in the Network?

Sharding has become a popular means of achieving scalability in application architectures in which read/write data separation is not only possible, but desirable to achieve new heights of concurrency. The premise is that by splitting up read and write duties, it is possible to get better overall performance at the cost of a slight delay in consistency. That is, it takes a bit of time to replicate changes initiated by a "write" to the read-only master database. It's eventually consistent, and it's generally considered an acceptable trade off when searching for higher and higher scalability.

While the most well-known cases of read/write separation and sharding are based on geography - east coast versus west coast, for example - there are other cases where localized sharding has also been put into play with great success. Generally these types of architectures base their sharding decisions on user names, splitting them up between databases based on statistical analysis of occurrences. This achieves greater scalability at the data layer by better distributing the rate at which writes (which are generally speaking a blocking action) occur, and thus achieving greater scale and concurrency for only a slight period of inconsistency.

The mechanism is, in theory, quite simple and is loosely based on an algorithmic principle taught in most computer science algorithm classes: hashing. Basically when a request comes in, the application looks at some piece of data - like the user name - and based on that data decides which of X databases to send it to. How that division is determined is not as relevant (to this discussion, anyway) as the action itself. It's like registration at an event or in college where you're split up based on the first letter of your last name. You remember, every one whose name starts with A-G go in this line, you others go over there, in those lines.

That's sharding. And it's most commonly implemented in the application, where the connection to a database is created and used to manage the data that is the lifeblood of every application and business today.

Now, I told you all that to share another approach to sharding; one that takes advantage of programmability in the network (data path programmability, to be precise).

Shard in the Network or the App?
In the first scenario, in which sharding occurs in the application, there's almost certainly (I'd be willing to bet real live money) a load balancing service in front. It's distributing requests to a pool (cluster) of application instances, each of which individually decides which database to talk to given the data available. If we insert some intelligence - some programmability - into that load balancing service, we move the sharding decision in front of the application.

Now when a request comes in, the load balancing service examines the data available and decides to which application instance the request should be sent. The data is likely the same - a user identity - but may be something more applicable to the service, say a product name or number extricated from the URL of a RESTful API fronting a microservice.

sharding - network versus application

Basically what you're doing is taking the block of code responsible for sharding from inside the application and moving it into the network.

In the diagram to your left (or at least it was on the left when I wrote this) illustrates. The reason the "apps" in the "network" illustration are colored is to highlight that each of them is dedicated to a specific database. The code - the app itself - is all the same. There's no difference except for the configuration that tells it "you are dedicated to database A-G".

This is starkly different from the "in the application" sharding example in which all instances are exactly the same, including the configuration, as each one may be at any time talking to any one of the databases, depending on the data received.

Now, I believe it's obvious (because I colored all the database connections) that when you shard in the application, the complexity of the network is pretty high, as well as the load on each database as it has to maintain connections with each and every application instance. Operational Axiom #2 tells us "as load increases, performance decreases" so it's likely we're seeing an impact on performance (in a negative way) from a sharding in the application architectural approach.

Conversely, the network complexity in the sharding in the network approach is fairly low (and straight forward) and actually simplifies the entire architecture. The load on the databases themselves remains lower because there is only one instance (or pool of instances) with which it needs to manage connections.

The negative of the "in the network" approach is that you have another component (service) that must be managed - that means application lifecycle management applies -  and there are likely separate configurations necessary for each of the pools (clusters) responsible for scaling out each application instance (because each pool only talks to one of the databases). But this negative also means that the code responsible for sharding is localized, it is itself a "network microservice" that is small and isolated, meaning it can be tweaked independently of the application code. That's a positive, especially if there might be a need to increase the level of sharding or change its core mechanism required to scale the application. That's one of the reasons microservices are growing more popular; the localization and isolation ensure the ability to change without disruptive impact on other services or applications.

Taking advantage of programmability in the network to achieve new levels of scalability while simplifying your architecture is another reason programmability in the network is an invaluable tool in your architectural toolbox.

More Stories By Lori MacVittie

Lori MacVittie is responsible for education and evangelism of application services available across F5’s entire product suite. Her role includes authorship of technical materials and participation in a number of community-based forums and industry standards organizations, among other efforts. MacVittie has extensive programming experience as an application architect, as well as network and systems development and administration expertise. Prior to joining F5, MacVittie was an award-winning Senior Technology Editor at Network Computing Magazine, where she conducted product research and evaluation focused on integration with application and network architectures, and authored articles on a variety of topics aimed at IT professionals. Her most recent area of focus included SOA-related products and architectures. She holds a B.S. in Information and Computing Science from the University of Wisconsin at Green Bay, and an M.S. in Computer Science from Nova Southeastern University.

@DevOpsSummit Stories
Did you know that you can develop for mainframes in Java? Or that the testing and deployment can be automated across mobile to mainframe? In his session at @DevOpsSummit at 20th Cloud Expo, Vaughn Marshall, Sr. Principal Product Owner at CA Technologies, will discuss and demo how increasingly teams are developing with agile methodologies using modern development environments and automating testing and deployments, mobile to mainframe.
The goal of Continuous Testing is to shift testing left to find defects earlier and release software faster. This can be achieved by integrating a set of open source functional and performance testing tools in the early stages of your software delivery lifecycle. There is one process that binds all application delivery stages together into one well-orchestrated machine: Continuous Testing. Continuous Testing is the conveyor belt between the Software Factory and production stages. Artifacts are moved from one stage to the next only after they have been tested and approved to continue. New code submitted to the repository is tested upon commit. When tests fail, the code is rejected. Subsystems are approved as part of periodic builds on their way to the delivery stage, where the system is being tested as production ready. The release process stops when tests fail. The key is to shift test ...
SYS-CON Events announced today that Juniper Networks (NYSE: JNPR), an industry leader in automated, scalable and secure networks, will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Juniper Networks challenges the status quo with products, solutions and services that transform the economics of networking. The company co-innovates with customers and partners to deliver automated, scalable and secure networks with agility, performance and value.
SYS-CON Events announced today that CA Technologies has been named "Platinum Sponsor" of SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, New York, and 21st International Cloud Expo, which will take place in November in Silicon Valley, California.
DevOps is often described as a combination of technology and culture. Without both, DevOps isn't complete. However, applying the culture to outdated technology is a recipe for disaster; as response times grow and connections between teams are delayed by technology, the culture will die. A Nutanix Enterprise Cloud has many benefits that provide the needed base for a true DevOps paradigm. In his Day 3 Keynote at 20th Cloud Expo, Chris Brown, a Solutions Marketing Manager at Nutanix, will explore the ways that Nutanix technologies empower teams to react faster than ever before and connect teams in ways that were either too complex or simply impossible with traditional infrastructures.
As DevOps methodologies expand their reach across the enterprise, organizations face the daunting challenge of adapting related cloud strategies to ensure optimal alignment, from managing complexity to ensuring proper governance. How can culture, automation, legacy apps and even budget be reexamined to enable this ongoing shift within the modern software factory?
New competitors, disruptive technologies, and growing expectations are pushing every business to both adopt and deliver new digital services. This ‘Digital Transformation’ demands rapid delivery and continuous iteration of new competitive services via multiple channels, which in turn demands new service delivery techniques – including DevOps. In this power panel at @DevOpsSummit 20th Cloud Expo, moderated by DevOps Conference Co-Chair Andi Mann, panelists will examine how DevOps helps to meet the demands of Digital Transformation – including accelerating application delivery, closing feedback loops, enabling multi-channel delivery, empowering collaborative decisions, improving user experience, and ultimately meeting (and exceeding) business goals.
SYS-CON Events announced today that Hitachi, the leading provider the Internet of Things and Digital Transformation, will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Hitachi Data Systems, a wholly owned subsidiary of Hitachi, Ltd., offers an integrated portfolio of services and solutions that enable digital transformation through enhanced data management, governance, mobility and analytics. We help global organizations open new revenue streams, increase efficiencies, improve customer experience and ensure rapid time to market in the digital age. Only Hitachi Data Systems powers the digital enterprise by integrating the best information technology and operational technology from across the Hitachi family of companies. We combine this experience with Hitachi expertise in the internet of things to d...
@DevOpsSummit has been named the ‘Top DevOps Influencer' by iTrend. iTred processes millions of conversations, tweets, interactions, news articles, press releases, blog posts - and extract meaning form them and analyzes mobile and desktop software platforms used to communicate, various metadata (such as geo location), and automation tools. In overall placement, @DevOpsSummit ranked as the number one ‘DevOps Influencer' followed by @CloudExpo at third, and @MicroservicesE at 24th.
Cloud promises the agility required by today’s digital businesses. As organizations adopt cloud based infrastructures and services, their IT resources become increasingly dynamic and hybrid in nature. Managing these require modern IT operations and tools. In his session at 20th Cloud Expo, Raj Sundaram, Senior Principal Product Manager at CA Technologies, will discuss how to modernize your IT operations in order to proactively manage your hybrid cloud and IT environments. He will be sharing best practices around collaboration, monitoring, configuration and analytics that will help you boost experience and optimize utilization of your modern IT Infrastructures.
While some vendors scramble to create and sell you a fancy solution for monitoring your spanking new Amazon Lambdas, hear how you can do it on the cheap using just built-in Java APIs yourself. By exploiting a little-known fact that Lambdas aren’t exactly single threaded, you can effectively identify hot spots in your serverless code. In his session at 20th Cloud Expo, David Martin, Principal Product Owner at CA Technologies, will give a live demonstration and code walkthrough, showing how to overcome the challenges of monitoring S3 and RDS. This presentation will provide an overview of necessary Amazon Lambda concepts and discuss how to integrate the monitoring data with other tools.
SYS-CON Events announced today that Hitachi, the leading provider the Internet of Things and Digital Transformation, will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Hitachi Data Systems, a wholly owned subsidiary of Hitachi, Ltd., offers an integrated portfolio of services and solutions that enable digital transformation through enhanced data management, governance, mobility and analytics. We help global organizations open new revenue streams, increase efficiencies, improve customer experience and ensure rapid time to market in the digital age. Only Hitachi Data Systems powers the digital enterprise by integrating the best information technology and operational technology from across the Hitachi family of companies. We combine this experience with Hitachi expertise in the internet of things to d...
In his keynote at 19th Cloud Expo, Sheng Liang, co-founder and CEO of Rancher Labs, discussed the technological advances and new business opportunities created by the rapid adoption of containers. With the success of Amazon Web Services (AWS) and various open source technologies used to build private clouds, cloud computing has become an essential component of IT strategy. However, users continue to face challenges in implementing clouds, as older technologies evolve and newer ones like Docker containers gain prominence. He explored these challenges and how to address them, while considering how containers will influence the direction of cloud computing.
SYS-CON Events announced today that SoftLayer, an IBM Company, has been named “Gold Sponsor” of SYS-CON's 18th Cloud Expo, which will take place on June 7-9, 2016, at the Javits Center in New York, New York. SoftLayer, an IBM Company, provides cloud infrastructure as a service from a growing number of data centers and network points of presence around the world. SoftLayer’s customers range from Web startups to global enterprises.
Cloud Expo, Inc. has announced today that Aruna Ravichandran, vice president of DevOps Product and Solutions Marketing at CA Technologies, has been named co-conference chair of DevOps at Cloud Expo 2017. The @DevOpsSummit at Cloud Expo New York will take place on June 6-8, 2017, at the Javits Center in New York City, New York, and @DevOpsSummit at Cloud Expo Silicon Valley will take place Oct. 31-Nov. 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA.
@DevOpsSummit at Cloud taking place June 6-8, 2017, at Javits Center, New York City, is co-located with the 20th International Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. The widespread success of cloud computing is driving the DevOps revolution in enterprise IT. Now as never before, development teams must communicate and collaborate in a dynamic, 24/7/365 environment. There is no time to wait for long development cycles that produce software that is obsolete at launch. DevOps may be disruptive, but it is essential.
Translating agile methodology into real-world best practices within the modern software factory has driven widespread DevOps adoption, yet much work remains to expand workflows and tooling across the enterprise. As models evolve from pockets of experimentation into wholescale organizational reinvention, practitioners find themselves challenged to incorporate the culture and architecture necessary to support DevOps at scale. In his session at @DevOpsSummit at 20th Cloud Expo, Anand Akela, Senior Director of DevOps Solutions at CA Technologies, will discuss how existing adopters are employing unified agile and DevOps techniques to engage functional processes and toolchains that deliver increased software quality, faster time-to-market and measurably improved customer experience.
Five years ago development was seen as a dead-end career, now it’s anything but – with an explosion in mobile and IoT initiatives increasing the demand for skilled engineers. But apart from having a ready supply of great coders, what constitutes true ‘DevOps Royalty’? It’ll be the ability to craft resilient architectures, supportability, security everywhere across the software lifecycle. In his keynote at @DevOpsSummit at 20th Cloud Expo, Jeffrey Scheaffer, GM and SVP, Continuous Delivery Business Unit at CA Technologies, will share his vision about the true ‘DevOps Royalty’ and how it will take a new breed of digital cloud craftsman, architecting new platforms with a new set of tools to achieve it. He will also present a number of important insights and findings from a recent cloud and DevOps study – outlining the synergies high performance teams are exploiting to gain significant busin...
SYS-CON Events announced today that T-Mobile will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. As America's Un-carrier, T-Mobile US, Inc., is redefining the way consumers and businesses buy wireless services through leading product and service innovation. The Company's advanced nationwide 4G LTE network delivers outstanding wireless experiences to 67.4 million customers who are unwilling to compromise on quality and value.
Everyone wants to use containers, but monitoring containers is hard. New ephemeral architecture introduces new challenges in how monitoring tools need to monitor and visualize containers, so your team can make sense of everything. In his session at @DevOpsSummit, David Gildeh, co-founder and CEO of Outlyer, will go through the challenges and show there is light at the end of the tunnel if you use the right tools and understand what you need to be monitoring to successfully use containers in your environments.
SYS-CON Events announced today that Super Micro Computer, Inc., a global leader in compute, storage and networking technologies, will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Supermicro (NASDAQ: SMCI), the leading innovator in high-performance, high-efficiency server technology, is a premier provider of advanced server Building Block Solutions® for Data Center, Cloud Computing, Enterprise IT, Hadoop/Big Data, HPC and Embedded Systems worldwide. Supermicro is committed to protecting the environment through its “We Keep IT Green®” initiative and provides customers with the most energy-efficient, environmentally friendly solutions available on the market.
NHK, Japan Broadcasting, will feature the upcoming @ThingsExpo Silicon Valley in a special 'Internet of Things' and smart technology documentary that will be filmed on the expo floor between November 3 to 5, 2015, in Santa Clara. NHK is the sole public TV network in Japan equivalent to the BBC in the UK and the largest in Asia with many award-winning science and technology programs. Japanese TV is producing a documentary about IoT and Smart technology and will be covering @ThingsExpo Silicon Valley. The program, to be aired during the peak viewership season of the year, will have a major impact on the industry in Japan. The film's director is writing a scenario to fit in the story in the next few days will be turned in to the network.
Keeping pace with advancements in software delivery processes and tooling is taxing even for the most proficient organizations. Point tools, platforms, open source and the increasing adoption of private and public cloud services requires strong engineering rigor – all in the face of developer demands to use the tools of choice. As Agile has settled in as a mainstream practice, now DevOps has emerged as the next wave to improve software delivery speed and output. To make DevOps work, organizations must focus on what is most relevant to deliver value, reduce IT complexity, create more repeatable agile-based processes and leverage increasingly secure and stable, cloud-based infrastructure platforms.
The 20th International Cloud Expo has announced that its Call for Papers is open. Cloud Expo, to be held June 6-8, 2017, at the Javits Center in New York City, brings together Cloud Computing, Big Data, Internet of Things, DevOps, Containers, Microservices and WebRTC to one location. With cloud computing driving a higher percentage of enterprise IT budgets every year, it becomes increasingly important to plant your flag in this fast-expanding business opportunity. Submit your speaking proposal today!
Join IBM November 2 at 19th Cloud Expo at the Santa Clara Convention Center in Santa Clara, CA, and learn how to go beyond multi-speed it to bring agility to traditional enterprise applications. Technology innovation is the driving force behind modern business and enterprises must respond by increasing the speed and efficiency of software delivery. The challenge is that existing enterprise applications are expensive to develop and difficult to modernize. This often results in what Gartner calls "Bimodal IT," where business struggle to apply modern tools and practices to traditional monolithic applications. But these existing assets can be modernized and made more efficient without having to be completely overhauled. By leveraging methodologies like DevOps and agile, alongside emerging technologies like cloud-native services and containerization, traditional applications and teams can be ...