Category Archives: App Dev

8 Things You May Not Know About Docker

It’s possible that containers and container management tools like Docker will be the single most important thing to happen to the data center since the mainstream adoption of hardware virtualization in the 90s. In the past 12 months, the technology has matured beyond powering large-scale startups like Twitter and Airbnb and found its way into the data centers of major banks, retailers and even NASA. When I first heard about Docker a couple of years ago, I started off as a skeptic. I blew it off as skillful marketing hype around an old concept: Linux containers. But after incorporating it successfully into several projects at Spantree, I am now a convert. It has saved my team an enormous amount of time, money and headaches and has become the underpinning of our technical stack.

If you’re anything like me, you’re often time-crunched and may not have a chance to check out every shiny new toy that blows up on GitHub overnight. So this article is an attempt to quickly impart 8 nuggets of wisdom that will help you understand what Docker is and why it’s useful.


Docker is a container management tool.

Docker is an engine designed to help you build, ship and execute application stacks and services as lightweight, portable and isolated containers. The Docker engine sits directly on top of the host operating system. Its containers share the kernel and hardware of the host machine with roughly the same overhead as processes launched directly on the host.

But Docker itself isn’t a container system; it piggybacks on the container facilities already baked into the OS, such as LXC on Linux. Those facilities have existed for many years, but Docker layers a much friendlier image management and deployment system on top of them.
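To make that concrete, here is a minimal sketch of the day-to-day workflow. The `nginx` image is just an example pulled from Docker Hub, not something the article prescribes:

```bash
docker pull nginx                           # download an image (a stack of filesystem layers)
docker run -d --name web -p 8080:80 nginx   # launch an isolated container from that image
docker ps                                   # list running containers, much like ps for processes
docker stop web && docker rm web            # stop and discard the container
```

Because the container shares the host kernel, starting and stopping it feels closer to launching a process than booting a machine.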


Docker is not a hardware virtualization engine.

When Docker was first released, many people compared it to virtualization hypervisors like VMware, KVM and VirtualBox. While Docker solves a lot of the same problems and shares many of the same advantages as hypervisors, it takes a very different approach. Virtual machines emulate hardware. In other words, when you launch a VM and run a program that hits disk, it’s generally talking to a “virtual” disk. When you run a CPU-intensive task, those CPU instructions need to be translated into something the host CPU understands. All these abstractions come at a cost: two disk layers, two network layers, two processor schedulers, even two whole operating systems that need to be loaded into memory. These limitations typically mean you can only run a few virtual machines on a given piece of hardware before you start to see an unpleasant amount of overhead and churn. On the other hand, you can theoretically run hundreds of Docker containers on the same host machine without issue.

All that being said, containers aren’t a wholesale replacement for virtual machines. Virtual machines provide a tremendous amount of flexibility in areas where containers generally can’t. For example, if you want to run a Linux guest operating system on top of a Windows host, that’s where virtual machines shine.


Docker uses a layered file system.

As mentioned earlier, one of the key design goals for Docker is to provide image management on top of existing container technology. In Docker terms, an image is a static, immutable snapshot of a container’s file system. But Docker rather cleverly takes this snapshotting concept a step further by incorporating a copy-on-write filesystem into its design. I’ve found the best way to explain this is by example:

Let’s say you want to build a Docker image to run your Java web application. You might start with one of the official Docker base images that have Java 8 pre-installed. In your Dockerfile (a text file which tells Docker how to build your image), you’d specify that you’re extending the Java 8 image, which instructs Docker to pull down the pre-built snapshot associated with that image. Now, let’s say you execute a command that downloads, extracts and configures Apache Tomcat into /opt/tomcat. This command will not affect the state of the original Java 8 image. Instead, it starts writing to a brand new filesystem layer. When a container boots up, it merges these layers together, loading /usr/bin/java from one layer and /opt/tomcat/bin from another. In fact, every step in a Dockerfile produces a new filesystem layer, even if only one file is changed. If you’re familiar with the Git version control system, this is similar to a commit tree, and it gives Docker users tremendous flexibility to compose application stacks iteratively.
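As a sketch only (the base image tag, the Tomcat version and download URL, and the WAR file name below are illustrative, not taken from the article), such a Dockerfile might look like this:

```dockerfile
FROM java:8                                   # extend an official Java 8 base image

# Download and extract Tomcat; each instruction below becomes its own read-only layer.
ADD https://archive.apache.org/dist/tomcat/tomcat-8/v8.0.53/bin/apache-tomcat-8.0.53.tar.gz /tmp/tomcat.tar.gz
RUN tar -xzf /tmp/tomcat.tar.gz -C /opt \
    && mv /opt/apache-tomcat-8.0.53 /opt/tomcat \
    && rm /tmp/tomcat.tar.gz

# Copying the application is another, very thin layer on top of everything above.
COPY myapp.war /opt/tomcat/webapps/ROOT.war

EXPOSE 8080
CMD ["/opt/tomcat/bin/catalina.sh", "run"]
```

Layers built once are cached and shared by every image that extends the same ancestry, which is exactly what makes the release workflow described next so cheap.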

At Spantree, we have a base image with Tomcat pre-installed, and on each application release we merely copy the latest deployable asset into a new image, tagging the Docker image to match the release version as well. Since the only variation between these images is the very last layer, a 90MB WAR file in our case, each image is able to share the same ancestors on disk. This means we can keep our old images around and roll back on demand at very little added cost. Furthermore, when we launch several instances of these applications side by side, they share the same read-only filesystems.
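A hypothetical release flow along those lines (the image name and version tags are placeholders) might be nothing more than:

```bash
docker build -t example/myapp:1.4.2 .             # only the final WAR layer is actually new
docker push example/myapp:1.4.2                   # shared ancestor layers are not re-uploaded
docker run -d -p 8080:8080 example/myapp:1.4.2    # deploy the new release
# Rolling back is just launching the previous tag, which is still on disk:
# docker run -d -p 8080:8080 example/myapp:1.4.1
```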


Docker can save you time.

Many years ago, I was working on a project for a major restaurant chain, and on the first day I was handed a 12-page Word document describing how to get my development environment set up to develop against all the various applications. I had to install a local Oracle database, a specific version of the Java runtime, and a number of other system and library dependencies and tooling. The whole setup process cost each member of my team approximately a day of productivity, which unfortunately translated to thousands of dollars in sunk costs for our client. Our client was used to this and considered it part of the cost of doing business when onboarding new team members, but as consultants we would much rather have spent that time building useful features that add value to our client’s business.

Had Docker existed at the time, we could have cut this process from a day to mere minutes. With Docker, you can express servers and services as code, similar to configuration tools like Puppet, Chef, Salt and Ansible. But, unlike those tools, Docker goes a step further by actually pre-executing those steps for you during its build process, snapshotting the output as an indexed, shareable disk image. Need to compile Node.js from source? No problem. The Docker runtime will do that on build and simply snapshot the output for you at the end. Furthermore, because Docker containers sit directly on top of the Linux kernel, there’s no risk of environmental variations getting in the way.

Nowadays, when we bring a new team member onto a client project, they merely have to run `docker-compose up`, grab a cup of coffee, and by the time they’re back they should have everything they need to start working.
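For reference, here is a minimal sketch of what such a `docker-compose.yml` might contain; the service names, images and ports are illustrative rather than our actual project configuration:

```yaml
version: "2"
services:
  web:
    build: .                  # build the application image from the project's Dockerfile
    ports:
      - "8080:8080"
    depends_on:
      - db
  db:
    image: postgres:9.4       # a ready-made database image pulled from Docker Hub
    environment:
      POSTGRES_PASSWORD: example
```

A single `docker-compose up` then builds the application image, pulls the database image and wires the two containers together.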


Docker can save you money.

Of course, time is money, but Docker can also save you hard, physical dollars as it relates to infrastructure costs. Studies from Gartner and McKinsey put average data center utilization at between 6% and 12%. Quite a lot of that underutilized capacity is due to static partitioning. With physical machines, or even hypervisors, you need to defensively provision CPU, disk and memory based on the high watermark of possible usage. Containers, on the other hand, allow you to share unused memory and disk between instances. This lets you pack many more services onto the same hardware, spinning them down when they’re not needed without worrying about the cost of bringing them back up again. If it’s 3am and no one is hitting your Dockerized intranet application but you need a little extra horsepower for your Dockerized nightly batch job, you can simply shift resources between the two applications running on common infrastructure.
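As an illustrative sketch (the image and container names are hypothetical; `--memory` and `--cpus` are standard `docker run` options in current Docker releases), capping and sharing resources can be as simple as:

```bash
docker run -d --name intranet      --memory=512m --cpus=0.5 example/intranet-app
docker run -d --name nightly-batch --memory=4g   --cpus=2.0 example/batch-job
# Any memory or CPU the intranet app is not using at 3am remains available to the
# batch job on the same host, with no VM-style carve-out reserved up front.
```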


Docker has a robust ecosystem of existing images.

At the time of writing, there are over 14,000 public Docker images available on the web. Most of these images are shared through Docker Hub. Similar to how GitHub has largely become the home of most major open-source projects, Docker Hub is the de facto resource for sharing and working with public Docker images. These images can serve as building blocks for your application or database services. Want to test drive the latest version of that hot new graph database you’ve been hearing about? Someone’s probably already gone to the trouble of Dockerizing it. Need to build and host a simple Rails application with a special version of Ruby? It’s now at your fingertips in a single command.
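A few examples of that single-command experience (the image tags below are only examples; substitute whatever versions you actually need):

```bash
docker run --rm ruby:2.1 ruby -e 'puts RUBY_VERSION'                    # a specific Ruby, nothing installed locally
docker run -d -p 5432:5432 -e POSTGRES_PASSWORD=example postgres:9.4    # a throwaway database to develop against
docker search neo4j                                                     # find community images, e.g. for a graph database
```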


Docker helps you avoid production bugs.

At Spantree, we’re big fans of “immutable infrastructure.” That is to say, wherever possible, we avoid making upgrades or changes on live servers. Instead, we build new servers from scratch, apply the new application code directly to a pristine image, roll the new release servers into the load balancer when they’re ready, and retire the old server instances once all our health checks pass. This gives us the ability to cleanly roll back if something goes wrong. It also gives us the ability to promote the same master images from dev to QA to production with no risk of configuration drift. By extending this approach all the way to the developer machine with Docker, we can also avoid the “it works on my machine” problem, because each developer is able to test their build locally in an environment that parallels production.


Docker only works on Linux (for now).

The technologies powering Docker are not necessarily new, but many of them, like LXC and cgroups, are specific to the Linux kernel. This means that, at the time of writing, Docker is only capable of hosting applications and services that can run on Linux. That is likely to change in the coming years: Microsoft has recently announced plans for first-class container support in the next version of Windows Server and has been working closely with Docker to achieve that goal. In the meantime, tools like boot2docker and Docker Machine make it possible to run and proxy Docker commands to a lightweight Linux VM on Mac and Windows environments.
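For example, a minimal Docker Machine setup on a Mac or Windows laptop might look like the following; the VirtualBox driver and the machine name `default` are one common configuration, not a requirement:

```bash
docker-machine create --driver virtualbox default   # provision a small Linux VM to host the Docker engine
eval "$(docker-machine env default)"                # point the local docker CLI at that VM
docker run hello-world                              # the command runs inside the Linux VM transparently
```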

Have you used Docker? What has your experience been like? If you’re interested in learning more about how Spantree and GreenPages can help with your application development initiatives, please reach out!


By Cedric Hurst, Principal, Spantree Technology Group, LLC

Cedric Hurst is Principal at Spantree Technology Group, a boutique software engineering firm based primarily out of Chicago, Illinois that focuses on delivering scalable, high-quality solutions for the web. Spantree provides clients throughout North America with deep insights, strategies and development around cloud computing, devops and infrastructure automation. Spantree is partnered with GreenPages to provide high-value application development and devops enablement to their growing enterprise client base. In his spare time, Cedric speaks at technical meetups, makes music, mentors students and hangs out with his daughter. To stay up to date with Spantree, follow them on Twitter @spantreellc.

CIO Focus Interview: Kevin Hall, GreenPages-LogicsOne

For this segment of our CIO Focus Interview Series, I sat down with our CIO and Managing Director, Kevin Hall. Kevin has a unique perspective, as he serves as GreenPages’ CIO while also acting as Managing Director of our customer-facing Professional Services and Managed Services divisions.


Ben: Can you give me some background on your IT experience?

Kevin: I’ve been a CIO for 17+ years, holding roles both in consulting organizations and overseeing internal IT. The position I have at GreenPages is very interesting because I am both a Managing Partner of our services business and the CIO of our organization. This is the first time I have held both jobs at the same time in one company.

Ben: What are your primary responsibilities for each part of your role then?

Kevin: As CIO, I’m responsible for all aspects of information services. This includes traditional data center functions, engineering functions, operations functions, and app dev functions. As Managing Director, I am responsible for our Professional Services and Managed Services divisions. These divisions help our customers with the same sorts of projects that I am undertaking as CIO.

Ben: Does it help you being in this unique position? Does it allow you to get a better understanding of what GreenPages’ customers are looking for since you experience the same challenges as CIO?

Kevin: Yes, I think it is definitely an advantage. The CIO role is crucial in this era. It has been a challenging job for a long time, and that has only been magnified in recent years by the fundamental shift and explosion in the capabilities available to modern-day CIOs. Because I am in this rather crazy position, it does help me understand the needs of our customers better. If I were just on the consulting side of the house, I’m not sure I could fully understand or appreciate how difficult some of the choices CIOs face really are. I can relate to that feeling of being blocked or trapped because I’ve experienced it. The good news is our CTO and Architects provide real-world lessons right here at home for both myself and our IT Director.

Interestingly enough, on the services side of my role, in both the Professional Services and Managed Services divisions, we are entering our 3rd year of an effort to realign those divisions in a way that helps CIOs solve the same demanding needs that I am facing. We’re currently helping companies with pure cloud, hybrid cloud and traditional approaches to information services. I’m both a provider of our services to other organizations and a customer of those services. Our internal IT team is actually a customer of our Professional and Managed Services divisions. We use our CMaaS platform to manage and operate our computing platforms internally. We also use the same help desk team our customers do. Furthermore, we use the various architects and engineers who serve our customers to help us with internal projects. For example, we have recently engaged our client-facing App Dev team to help GreenPages reimagine our internal Business Intelligence systems and are now developing our next-generation BI tools and capabilities. Another example would be a project we recently completed to look at our networking and security infrastructure in order to be prepared to move workloads from on-prem or colo facilities to the cloud. We had to add additional capabilities to our network and went out and got the SOC 2 Type 2 certification, which really speaks to the importance we place on security. What I love about working here is that we don’t just talk about hybrid cloud; we are actively and successfully using those models for our own business.

Ben: What are some of your goals for 2015?

Kevin: On the internal IT side, I’m engaged, like many of my colleagues around the globe, in assessing what the new computing paradigm means for our organization. We’ve embarked on looking at every aspect of our environment, along with our ability to deliver services to the GreenPages organization. Our goal is to figure out how to do this in a cost-effective, scalable, and flexible way that meets the needs of our organization.

Ben: Any interesting projects you have going on right now?

Kevin: As we assess our workloads and start trying to understand the best execution venues for them, it’s become pretty clear that we are going to be using more than a single venue. For example, one big execution venue for us is VMware’s vCloud Air. We have some workloads that are excellent candidates for that venue. Other workloads are great fits for Microsoft Azure. We have some initiatives, like the BI project, that are going to be built on open source; we’ll be utilizing things like Docker and Hadoop that are most likely going to be highly optimized around Amazon’s capabilities. This is giving me insight into the notion that there are many different capabilities between clouds. The important thing is to make sure every workload is optimized for the right cloud. This is an important ongoing exercise for us in 2015.

Ben: Which area of IT would you say interests you the most?

Kevin: What interests me most about IT is the organizational aspect. How do you organize in a way that creates value for the company? How do you prioritize in terms of people, process and technology? For me, it’s not about one particular aspect; it’s about the entire program and how it all functions.

Ben: What are you looking forward to in 2015 from a technology perspective?

Kevin: I’m really looking forward to our annual Summit event in August. I think it is going to be the best one yet. If you look back several years ago, very few attendees raised their hand when asked if they thought the cloud was real. Last year, most of the hands in the room went up. What will make it especially interesting this year is that we have many customers deeply involved with these types of projects. Four years ago the only option was to sit and listen to presentations, but now our customers will have the opportunity to talk to their peers about how they are actually going about doing cloud. It will be a great event and a fantastic learning opportunity.

Are you looking for more information around the transformation of corporate IT? Download this eBook from our Director of Cloud Services John Dixon to learn more!


By Ben Stephenson, Emerging Media Specialist

Top 25 Findings from Gigaom’s 4th Annual “Future of Cloud Computing” Survey

By Ben Stephenson, Journey to the Cloud


Gigaom Research and North Bridge Partners recently released their 4th annual “Future of Cloud Computing” study. There was some great data gathered from the 1,358 respondents surveyed. In case you don’t have time to click through the entire 124-slide SlideShare deck, I’ve pulled out what I think are the 25 most interesting statistics from the study. Here’s the complete deck if you would like to review it in more detail.


  • 49% using the cloud for revenue generating or product development activities (Slide 9)
  • 80% of IT budget is used to maintain current systems (Slide 20) <–> GreenPages actually held a webinar recently explaining how organizations can avoid spending the majority of their IT budgets on “keeping the lights on”
  • For IT across all functions tested in the survey, 60-85% of respondents will move some or significant processing to the cloud in the next 12-24 months (Slide 21)
  • Shifting CapEx to OpEx is more important for companies with over 5,000 employees (Slide 27)
  • For respondents moving workloads to the cloud today, 27% said they are motivated to do so because they believe using a cloud platform service will help them lower their capital expenditures (Slide 28)
  • Top inhibitor: security remains the biggest concern; despite declining slightly last year, it rose again as an issue in 2014 and was cited by 49% of respondents (Slide 55)
  • Privacy is of growing importance. As an inhibitor, Privacy grew from 25% in 2011 to 31% (Slide 57)
  • Over 1/3 see regulatory/compliance as an inhibitor to moving to the cloud (Slide 60)
  • Interoperability concerns dropped by 45%, relatively, over the past two years…but 29% are still concerned about lock-in (Slide 62)
  • Nearly ¼ of respondents still think network bandwidth is an inhibitor (Slide 64)
  • Reliability concerns dropped by half since 2011 (Slide 66)
  • Amazon S3 holds trillions of objects and regularly peaks at 1.5 million requests per second (Slide 71)
  • 90% of the world’s data was created in the past two years…80% of it is unstructured (Slide 73) <–> Here’s a video blog where Journey to the Cloud blogger Randy Weis talks about big data in more detail
  • Approximately 66% of data is in the cloud today (Slide 74)
  • That number is expected to grow to 73% in two years (Slide 75)
  • 50% of enterprise customers will purchase as much storage in 2014 as they have accumulated in their ENTIRE history (Slide 77)
  • IaaS use has jumped from 11% in 2011 to 56% in 2014 & SaaS has increased from 13% in 2011 to 72% in 2014 (Slide 81)
  • Applications Development growing 50% (Slide 84) <–> with the growth of app dev, we’re also seeing the growth of shadow IT. Check out this on-demand webinar “The Rise of Unauthorized AWS Use. How to Address Risks Created by Shadow IT.”
  • PaaS approaching the tipping point! PaaS has increased from 7% in 2011 to 41% in 2014. (Slide 85) <–> See what one of our bloggers, John Dixon, predicted in regards to the rise of PaaS at the beginning of the year.
  • Database as a Service expected to nearly double, from 23% to 44% among users (Slide 86)
  • By 2017, nearly 2/3rds of all workloads will be processed in cloud data centers. Growth of workloads in cloud data centers is expected to be five times the growth in traditional workloads between 2012 and 2017. (Slide 87)
  • SDN usage will grow among business users almost threefold…from 11% to 30%  (Slide 89) <–> Check out this video blog where Nick Phelps talks about the business drivers behind SDN.
  • 42% use hybrid cloud now (Slide 93)
  • That 42% will grow to 55% in 2 years (Slide 94) <–> This whitepaper gives a nice breakdown of the future of hybrid cloud management.
  • “This second cloud front will be an order of magnitude bigger than the first cloud front.” (Slide 117). <–> hmmm, where have I heard this one before? Oh, that’s right, GreenPages’ CEO Ron Dupler has been saying it for about two years now.

Definitely some pretty interesting takeaways from this study. What are your thoughts? Did certain findings surprise you?