I was interested in exploring how WordPress has changed over time. The first public release, version 0.7, was in May 2003. In the nearly 20 years since, its user base has grown vastly, and it has been under constant development throughout. So how much has the code base grown? Has that growth been linear or exponential? Has the balance of programming languages changed? To find out, I counted the number of lines of code in each version throughout its history.
Lines of Code
Lines of code (LOC) is a terrible metric for measuring quality and it tells us next to nothing about the value contributed by those lines. It is purely a measure of scale, but I think that is interesting in itself.
Some languages are more concise than others, so the same functionality can take more or fewer lines depending on the language. That matters less here, though, because I am looking at trends within one project rather than comparing it against others.
Method
To perform the counts, I ran the cloc tool on a clean copy of WordPress pulled from the GitHub mirror of the core WordPress repository.
- The wp-content/themes and wp-content/plugins directories were excluded from the counts, to avoid skewing the numbers by which default themes and plugins were packaged.
- The counts also exclude comments and blank lines.
- Versions of WordPress prior to 1.5 are not in the git repository, so have been omitted for simplicity.
- WordPress contains many third-party packages, like jQuery, React and TinyMCE. I’ve opted to exclude these as much as possible, as well as any minified CSS and JavaScript files (WordPress also ships the un-minified versions, so excluding the minified copies avoids double counting).
There is a separate development repository for WordPress which only contains the source files with build scripts. It would make a lot of sense to use that repository instead. However, it only took its current form in version 3.7, so it wouldn’t give a like-for-like comparison for the period prior to that.
The full code used is available on GitHub, if you want to reproduce or extend this study, or suggest improvements.
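For readers who want a feel for the mechanics without opening the repository, here is a minimal Python sketch of one counting step. It is not the script used for this post. The two regular expressions and the function names are my own assumptions that approximate the exclusions listed above, and the third-party library exclusions are left out entirely. It assumes cloc is installed and on the PATH.

```python
import json
import subprocess

# Assumption: these patterns approximate the exclusions described in the post;
# the script actually used for the study may differ.
EXCLUDED_DIRS = r"wp-content/(themes|plugins)"
MINIFIED_FILES = r"\.min\.(js|css)$"


def build_cloc_command(source_dir):
    """Return the cloc argument list for one checked-out WordPress version."""
    return [
        "cloc",
        "--json",                           # machine-readable output
        "--fullpath",                       # match the regexes against full paths
        f"--not-match-d={EXCLUDED_DIRS}",   # skip bundled themes and plugins
        f"--not-match-f={MINIFIED_FILES}",  # skip minified assets
        source_dir,
    ]


def parse_cloc_json(output):
    """Map each language to its code-line count, ignoring cloc's metadata keys."""
    report = json.loads(output)
    return {
        language: stats["code"]  # "code" already excludes comments and blanks
        for language, stats in report.items()
        if language not in ("header", "SUM")
    }


def count_lines(source_dir):
    """Run cloc on a directory and return {language: lines_of_code}."""
    result = subprocess.run(
        build_cloc_command(source_dir),
        capture_output=True, text=True, check=True,
    )
    return parse_cloc_json(result.stdout)
```

Calling count_lines on a directory checked out at a given release tag would give the per-language totals for that version; repeating it for each tag builds up the full history.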
Results
It’s not much of a shock to discover that WordPress has grown substantially since its earliest versions. It is surprising to see just how much it has expanded in more recent versions though. The chart below shows a leap at version 5.0, most likely due to the introduction of the Gutenberg block editor. Since version 5.0 the rate of growth has clearly accelerated, with over 200k lines of code added. Version 5.9 has over twice the number of lines as 4.9.
Breakdown by Language
To get a clearer idea of where the growth has come from, we need to split the results out by language.
It’s important to note a limitation in the way cloc determines programming language. Each file is treated as one language only; cloc cannot parse a file to separate the languages within it. So for example, a PHP file may contain many lines of HTML markup, but it will all be counted as PHP.
The ‘other’ group in these results covers XML, SVG, Sass, Markdown and JSON.
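The grouping itself is simple to sketch. The snippet below is an illustration rather than the code behind the charts: it assumes the per-language counts arrive as a dictionary keyed by cloc's language names, and the function name and the rule that anything outside the three main languages falls into 'other' are my assumptions.

```python
from collections import Counter

# Languages charted on their own; everything else is folded into "other".
MAIN_LANGUAGES = ("PHP", "JavaScript", "CSS")


def group_languages(counts):
    """Collapse {language: lines} into PHP, JavaScript, CSS and 'other'."""
    grouped = Counter()
    for language, lines in counts.items():
        bucket = language if language in MAIN_LANGUAGES else "other"
        grouped[bucket] += lines
    return dict(grouped)
```

Because cloc assigns each file to exactly one language, every line lands in exactly one of these buckets and the grouped totals sum to the overall count.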
As expected, the increase in size since version 5.0 is largely due to the JavaScript introduced for the Gutenberg editor. Despite its reputation as a PHP-based CMS, since version 5.0 WordPress has contained almost as much JavaScript as PHP. In fact, if you include the JavaScript libraries, it contains more lines of JavaScript than PHP (317,049 vs 257,918 in version 5.9).
The amount of CSS seems extraordinary, considering this is only for the admin. The lines of CSS almost doubled between versions 3.7 and 3.8, when a new admin interface was introduced, and they have increased six-fold since then. Naturally, all the blocks added since 5.0 require more styles.
Breakdown by Time
The WordPress release cycle hasn’t always been very regular. For example, there was less than 3 months between versions 3.6 and 3.7, but over a year between 4.9 and 5.0. So it’s interesting to check the rate of advance over time, not just across released versions.
Clearly, plotting against time does nothing to smooth out the rate of progress. The increased rate of expansion since 5.0 is very apparent. It’s probably explained by the new surface area that Gutenberg added. Rather than refining and extending existing features, suddenly there was a whole new territory to expand into with entirely new blocks.
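One way to put a number on the rate of advance is to divide the lines added between consecutive releases by the days that separate them. The sketch below is only an illustration of that calculation, not the method used for the chart; the function name and the input shape (a date-sorted list of release date and total line count pairs) are assumptions.

```python
from datetime import date


def growth_per_day(releases):
    """Return (release_date, lines_added_per_day) for each consecutive pair.

    `releases` is a list of (datetime.date, total_lines) tuples sorted by date.
    Pairs released on the same day are skipped to avoid dividing by zero.
    """
    rates = []
    for (prev_date, prev_lines), (cur_date, cur_lines) in zip(releases, releases[1:]):
        days = (cur_date - prev_date).days
        if days > 0:
            rates.append((cur_date, (cur_lines - prev_lines) / days))
    return rates
```

A roughly constant value would point to linear growth, while a value that keeps climbing from release to release would point to something faster.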
Conclusions
How much all of this really matters is debatable. It would be easy to say this rapid expansion of the WordPress code base demonstrates how bloated WordPress has become. But I think that understates the changes that have occurred in the WordPress project over time. The truth is WordPress is no longer a PHP CMS; it’s a hybrid PHP-JavaScript beast.
Throughout its history, WordPress has endeavoured to maintain complete backwards compatibility. That will inevitably incur a cost in terms of lines of code. The project has also insisted on a stubborn policy of support for outdated PHP versions, which has excluded some of the more expressive syntax. Take a look at Brent Roose’s Evolution of a PHP Object post, for an extreme example of what a difference this can make.
My concern is the direction and steepness of the trend. The rate of expansion is doing anything but slowing. WordPress’s current four-phase blocks roadmap is only at the halfway stage, so I think we can expect the trend to continue for the foreseeable future. Perhaps, when the domination of full-site editing with blocks is complete, we’ll see a refactor to remove the old methods for managing menus, widgets etc. I won’t be placing any bets on it though.