It is exceptionally difficult to keep content management systems up to date with the number of security patches that are released. Nevertheless, many sites are powered by software such as WordPress or Drupal.
A good site admin (one following various “lockdown” guides) will take steps to stop version numbers being disclosed in HTTP response headers or within returned content, as per recommendations like those linked below:
- WordPress – How to remove version information
- WordPress – Remove TXT files and “readme.html”
- Drupal – Remove TXT files commonly used to enumerate versions
- Drupal – Remove ‘X-Generator’ Response Header
As security professionals we also tend to recommend such steps, since doing so proactively helps you engage customers in securing their site. However, the majority of attacks against publicly known issues are conducted by blind brute force. Real-world attackers simply do not bother to check for version information before they fire the exploit code at you. What is it to them if their illegal activity causes your site to crash or get defaced?
For a penetration tester, though, it is just downright unprofessional to fire public exploits at a target and hope something sticks in the manner a real threat agent would. So, in choosing to “secure” your site, you may effectively only be masking problems that your rather expensive penetration testing provider would otherwise have located.
How do you ensure that your customer is not vulnerable when they have taken steps to obscure full version information? This was the question I had to answer last week.
To illustrate the workflow, consider the following steps:
- You come across a target using Drupal
- You observe version 7 in the HTTP response headers but are unable to obtain specific minor version information.
- You turn to an established fingerprinting tool such as BlindElephant and point it at your target:
BlindElephant.py https://TARGETSITE/ drupal
Loaded /usr/local/lib/python2.7/dist-packages/blindelephant/dbs/drupal.pkl with 145 versions, 478 differentiating paths, and 434 version groups.
Starting BlindElephant fingerprint for version of drupal at https://TARGETSITE/
Hit https://TARGETSITE/CHANGELOG.txt
File produced no match. Error: Failed to reach a server: Not Found
Hit https://TARGETSITE/INSTALL.txt
File produced no match. Error: Failed to reach a server: Not Found
Error: All versions ruled out!
This has failed us because it could not find either of the two files it looked for. The BlindElephant approach is (I believe) reliant on centrally maintaining a database of files to check for. To me that sounds like a lot more work than I am willing to put into life!
Then I thought to myself: “But cornerpirate, the site is powered by code which is entirely available on github.com, can’t we use the features of git to give a really robust answer?”
The answer was “yes we can!”. A few hours later, enter “git-version”. The workflow with git-version is a little different:
- You come across a target using Drupal
- You observe version 7 in the HTTP response headers but are unable to obtain specific minor version information.
- You clone the public-facing GitHub repository for Drupal 7:
- git clone -b 7.x --single-branch https://github.com/drupal/drupal.git
- You do offline reconnaissance against your newly downloaded Drupal 7 folder. This equates to “hunting for static content”:
- Find a unique list of file extensions (inside the new ‘drupal’ directory): find . -type f | perl -ne 'print $1 if m/\.([^.\/]+)$/' | sort -u
- Review the output above to find anything static. This will at least be; *.txt, *.html, *.js, *.inc, *.sql in the case of Drupal. There are potentially a few more in there.
- Create a list of the file names for such static content: find . -name '*.inc' > inc-files.txt
- Repeat for all interesting file types.
- You now have a list of files you want to check for on the target site.
- From here you need to try and download every single one of those files from your target site.
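The “hunting for static content” steps above could also be scripted. What follows is only a rough sketch in Python, not part of git-version. The function names (`list_static_files`, `candidate_urls`) and the extension list are my own assumptions based on the file types mentioned above. It walks the cloned folder, keeps files with a static-looking extension, and turns them into URLs to request from the target. Actually fetching those URLs is deliberately left out.

```python
import os
from urllib.parse import quote

# Extensions treated as "static" here; an assumption taken from the list
# above, so extend it after reviewing your own find/perl output.
STATIC_EXTENSIONS = {".txt", ".html", ".js", ".inc", ".sql"}


def list_static_files(repo_dir, extensions=STATIC_EXTENSIONS):
    """Return repo-relative paths (with forward slashes) of static files."""
    found = []
    for root, dirs, files in os.walk(repo_dir):
        # Do not descend into git's own metadata directory.
        if ".git" in dirs:
            dirs.remove(".git")
        for name in files:
            if os.path.splitext(name)[1].lower() in extensions:
                rel = os.path.relpath(os.path.join(root, name), repo_dir)
                found.append(rel.replace(os.sep, "/"))
    return sorted(found)


def candidate_urls(base_url, rel_paths):
    """Build one URL per path, assuming the site serves the repo root."""
    base = base_url.rstrip("/") + "/"
    return [base + quote(path) for path in rel_paths]
```

Calling `candidate_urls("https://TARGETSITE/", list_static_files("drupal"))` gives you the list of things to try downloading. It assumes the target serves Drupal from the web root, which will not always be true.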
When you find a file simply download it and then use git-version to check which revision that file is at. Ideally you want to base your version on something which has hundreds of revisions. In the case of Drupal those *.inc files appear to be good candidates.
In my case the site allowed access to “bootstrap.inc” which I then passed as input into git-version:
git-version.py bootstrap.inc drupal/
Found at [9/637]: https://github.com/drupal/drupal/blob/9f72251c9291b5613acb9ca4ea7a51b4739e3f93/includes/bootstrap.inc
Here we have a ‘mildly’ outdated site. The most recent version is 1/637 where 637 is the total number of revisions. As we are using 9/637 there are 8 newer revisions.
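To make the idea behind that “9/637” answer concrete, here is a minimal sketch of how such a lookup might work. This is not the actual git-version code; the function names are hypothetical and the real tool may work quite differently. It assumes `git` is on your PATH, asks git for every commit that touched the file (newest first), pulls the file contents at each commit, and reports the position of the first exact match.

```python
import subprocess


def file_history(repo_dir, rel_path):
    """Commit hashes that touched rel_path, newest first."""
    out = subprocess.check_output(
        ["git", "-C", repo_dir, "log", "--format=%H", "--", rel_path]
    )
    return out.decode().split()


def blob_at(repo_dir, commit, rel_path):
    """Contents of rel_path (relative to the repo root) as of commit."""
    return subprocess.check_output(
        ["git", "-C", repo_dir, "show", f"{commit}:{rel_path}"]
    )


def match_revision(downloaded, revisions):
    """revisions: (commit, content_bytes) pairs, newest first.

    Returns (position, total, commit) with a 1-based position,
    or None when no revision is byte-for-byte identical.
    """
    revisions = list(revisions)
    for position, (commit, content) in enumerate(revisions, start=1):
        if content == downloaded:
            return position, len(revisions), commit
    return None


def fingerprint(repo_dir, rel_path, downloaded):
    """Compare downloaded bytes against every revision in the clone."""
    revisions = []
    for commit in file_history(repo_dir, rel_path):
        try:
            revisions.append((commit, blob_at(repo_dir, commit, rel_path)))
        except subprocess.CalledProcessError:
            # The file does not exist at this commit (e.g. it was deleted).
            continue
    return match_revision(downloaded, revisions)
```

Something like `fingerprint("drupal", "includes/bootstrap.inc", open("bootstrap.inc", "rb").read())` would then return the position, the total number of revisions and the matching commit hash. An exact byte comparison is the simplest choice; it will miss files the server has altered in transit (line endings, for example).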
If you visit the URL provided it will take you to that version of the file on GitHub, where you can typically learn things from the commit message.

Great success! That version of the file is literally md5 checksum identical to the version in release 7.41 of Drupal.
Also note that the ‘bootstrap.inc’ file happens to helpfully announce the version anyway. So in the case of Drupal 7, we could replace the entire ‘git-version’ workflow with:
wget -qO - http://<targetsite>/includes/bootstrap.inc | grep "'VERSION'"
But I didn’t bloody know that at the start….
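If you want that shortcut inside a script rather than a one-liner, something like the sketch below would do. It assumes the file declares the version in the form `define('VERSION', '7.41');`, and the `extract_version` name is just mine. It works on text you have already downloaded, so there is no network code here.

```python
import re

# Assumed declaration format: define('VERSION', '7.41');
VERSION_RE = re.compile(r"define\(\s*'VERSION'\s*,\s*'([^']+)'\s*\)")


def extract_version(source_text):
    """Return the declared version string, or None if none is found."""
    match = VERSION_RE.search(source_text)
    return match.group(1) if match else None
```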
Anyway. Fingerprinting with git is here and it is going to be useful.