@mtldo + @cfortier at #chefconf2014
At ChefConf 2014, Chris Fortier and I had the privilege of presenting on the challenges of moving from a physical data center to the cloud. Beyond the move itself, we had to shift toward more automation and a hands-off approach to managing servers. This meant learning Amazon Web Services in depth and getting Chef onto every one of our machines. The result of our work was a library of cookbooks that worked reliably in three distinct locations: physical servers in Rackspace, laptops in our SoHo office, and cloud instances in AWS. As we developed these cookbooks, we gradually improved our process and testing techniques. We reached a flow that kept cookbooks tested and trustworthy no matter where we launched. It also gave the whole team visibility into system changes that would otherwise have been easy to miss.
Read more →
Clase Premier / Scarlett Johansson by César Moreno on Behance
Here at Behance, we deploy a lot of code, very VERY frequently. We are constantly adding new features to our applications, hotfixing bugs, and changing things to give everyone a better user experience.
Currently, we have three operations engineers with the credentials to build our applications to our pre-production and production environments (myself, Ko Uchiyama, and Chris Henry). If none of us is available for whatever reason, changes don’t get pushed. At the same time, if we ARE available, on a really bad deployment day we can receive anywhere from 12 to 30 requests to push changes, which can REALLY interrupt our workflow or weekend (especially mine, since I’ve become the “main build engineer guy” lately). How can we make this workflow better?
Build a robot to do it for you! Duh.
Read more →
Behance Distributed Logging Application - Log View
What do you do when there is an error across 7 different web applications running in production and load balanced across 200+ servers? Good luck logging into each and grepping some logs. No, crying won’t help; we tried that. Instead, why not build a tool to log all of those errors into a centralized location? After evaluating the many (fantastic) pre-built options, like Loggly, Splunk, and even Syslog, we found that none of them provided all the capabilities we wanted. The job of building our own solution then fell to Matt LeBrun and me, and here is our awesome journey.
Read more →
The Behance Network uses a number of nifty little binary files to power some of the useful services that our platform offers. Two of them in particular are wkhtmltoimage and wkhtmltopdf (both 64-bit static binaries). These two tools convert HTML to either a PDF or a thumbnail image of a full webpage and dump the output. They work flawlessly on both our sandbox environments (Ubuntu 11.04) and our production image servers (CentOS 5.5). When we try to execute them on one of our new Imageservice cloud servers (CentOS 5.4), however, we receive the dreaded:
Let’s start with the basics: what exactly is a segfault?
According to Wikipedia (http://en.wikipedia.org/wiki/Segmentation_fault):
A segmentation fault (often shortened to segfault) or bus error is generally an attempt to access memory that the CPU cannot physically address. It occurs when the hardware notifies a Unix-like operating system about a memory access violation. The OS kernel then sends a signal to the process which caused the exception. By default, the process receiving the signal dumps core and terminates.
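To make the signal part of that definition concrete, here is a minimal sketch in Python. It is not taken from our tooling; the function name `run_crashing_child` is made up for illustration. It assumes a Unix-like system: a child process deliberately reads address 0 through `ctypes`, the kernel delivers SIGSEGV, and the parent inspects how the child died.

```python
import signal
import subprocess
import sys

# Code run in the child: reading from address 0 is an invalid memory access,
# so the kernel delivers SIGSEGV to the child process.
CHILD_CODE = "import ctypes; ctypes.string_at(0)"


def run_crashing_child():
    """Run a child that segfaults and return its exit status (hypothetical helper).

    On POSIX, subprocess reports death-by-signal as a negative return code,
    i.e. -N means the child was terminated by signal N.
    """
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


if __name__ == "__main__":
    code = run_crashing_child()
    if code < 0:
        # Translate the signal number back into its name, e.g. SIGSEGV.
        print("child killed by", signal.Signals(-code).name)
    else:
        print("child exited normally with status", code)
```

A shell reports the same event as "Segmentation fault"; whether a core file is actually written depends on the system's core dump limits.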
Read more →