At LexBlog, we manage a lot of sites with a small (but mighty!) team. While we carefully introduce new features on a regular basis through a combination of automated and functional tests, it’s much easier to trust the process (any Philadelphia 76ers fans out there?) when your team is responsible for writing that functionality. However, because LexBlog’s platform is built on WordPress and includes a variety of third-party plugins not written by LexBlog’s product team, we often have to introduce new code to the platform without the luxury of reading each line. In fact, we’re preparing for a core update right now, since WordPress 4.9 has been out long enough to see a security release added to the initial point release.

In our line of business, this is fraught with peril: not all sites are created equal (they often run different bodies of code), and the standard at LexBlog is high enough that a few pixels of change is cause for concern. So how do we do it?

It starts with a lot of research. Each core update of WordPress is reviewed to see what new features are being introduced. LexBlog never lags behind a security release, but for functional releases we may choose to wait one or two cycles to let any bugs get identified and resolved by the core team. We also review the change logs of each third-party plugin with a pending update. This ensures we can note any major functional changes and communicate with our clients as needed.

The next step is automated testing, and a lot of it. LexBlog partners with WP Engine, which provides a replicated staging environment completely separate from our customers’ production environments. Here we can smoke test core and plugin updates on sites that look and act just like their live versions. We use an in-house application that communicates with the staging environment to determine which sites to test and, through the power of Selenium, visits each site with a browser it controls and takes a screenshot of how the site currently looks. Before making an update in staging, we take screenshots of all soon-to-be-updated sites. These screenshots are sent to Applitools via its API and serve as the baseline test (i.e., how each site looks before anything is updated). We then update core and any plugins in staging and rerun our automated visual tests, with Applitools and Selenium checking for any differences between the updated versions of the sites and the baseline. As we notice issues, we create tickets in our internal ticketing systems and apply patches before any updates are made in production.
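To make the baseline-then-compare flow concrete, here is a minimal sketch of the idea. It is not our in-house application: the function names are made up for illustration, the screenshot step is passed in as a `capture` function, and a crude byte-for-byte comparison stands in for the much smarter visual comparison that Applitools performs.

```python
from typing import Callable, Dict, List

# Hypothetical type: anything that visits a URL and returns screenshot bytes.
Capture = Callable[[str], bytes]


def take_baselines(urls: List[str], capture: Capture) -> Dict[str, bytes]:
    """Screenshot every site before the update and keep the results."""
    return {url: capture(url) for url in urls}


def diff_ratio(before: bytes, after: bytes) -> float:
    """Fraction of bytes that differ; a size change counts as fully different."""
    if len(before) != len(after):
        return 1.0
    if not before:
        return 0.0
    changed = sum(1 for a, b in zip(before, after) if a != b)
    return changed / len(before)


def find_regressions(
    baselines: Dict[str, bytes], capture: Capture, tolerance: float = 0.0
) -> Dict[str, float]:
    """Re-screenshot each site after the update and report those that changed."""
    regressions = {}
    for url, before in baselines.items():
        ratio = diff_ratio(before, capture(url))
        if ratio > tolerance:
            regressions[url] = ratio
    return regressions
```

In a real run, `capture` could wrap a Selenium driver (for example, calling `driver.get(url)` and then `driver.get_screenshot_as_png()`), with `take_baselines` running before the staging update and `find_regressions` running after it. Every URL it returns is a candidate for a ticket.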

LexBlog also uses a variety of monitoring tools that can warn us in the case of a code-level error. WP Engine again helps us out with PHP/Apache error logs that we can review for these staging environments. In addition, each site on our platform has a monitoring service, Sentry, that reports on PHP and JavaScript errors, complete with a stack trace (not available inside WP Engine) for our team to review. As with our visual regression tests, if a new error is logged in staging, our team investigates and resolves it where needed.
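The "new error" check boils down to comparing what was logged before an update with what is logged after it. The sketch below assumes each error has already been reduced to a hypothetical `(type, message)` pair; it does not reflect the actual format of Sentry events or WP Engine logs.

```python
from typing import Iterable, List, Tuple

# Hypothetical fingerprint for an error: (error type, message).
ErrorKey = Tuple[str, str]


def new_errors(before: Iterable[ErrorKey], after: Iterable[ErrorKey]) -> List[ErrorKey]:
    """Return errors seen after the update that were not seen before it.

    Each new error is reported once, in the order it first appeared.
    """
    known = set(before)
    fresh = []
    for err in after:
        if err not in known:
            known.add(err)  # avoid reporting the same new error twice
            fresh.append(err)
    return fresh
```

Errors that already existed before the update are filtered out, so the list that comes back contains only what the update itself may have introduced.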

Of course, all this isn’t quite enough to help me sleep peacefully at night when we run a production update, so we have a few other tools available. Each production installation on our platform has New Relic monitoring installed, which allows us to see in real time how our network is performing. If we notice changes after an update, then it’s time to investigate. We also make use of Pingdom and StatusCake as a last line of defense. Both of these services report on the uptime of individual domains across our network, in case all of our testing missed something and a site is experiencing downtime.
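For a sense of what an uptime check does at its simplest, here is a rough sketch using only Python's standard library. It is an illustration of the concept, not how Pingdom or StatusCake work internally, and the function names are invented for this example.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, List


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for a URL, or 0 if the site is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 500.
        return err.code
    except OSError:
        # DNS failure, refused connection, timeout, and similar problems.
        return 0


def sites_down(
    urls: Iterable[str], fetch: Callable[[str], int] = http_status
) -> List[str]:
    """List the URLs whose status code is outside the 200-399 range."""
    return [url for url in urls if not 200 <= fetch(url) < 400]
```

Running `sites_down` on a schedule against a list of domains and alerting on a non-empty result is the basic shape of the safety net. The `fetch` parameter exists so the check can be exercised with fake status codes instead of live requests.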

It’s a far cry from just enabling automatic updates in WordPress and clicking the update button each time a plugin has a new release, but I personally wouldn’t have it any other way.