Many tools claim to offer Word or Office password recovery (that is, to crack the password of a document), and they attract plenty of customer attention. Most are labelled as free password recovery tools, but in reality we have to buy the software to get the full version. The question is: how many of you have actually succeeded with such tools? Most of them share the common methods explained below, and we should think hard about their success rate before spending money on them.
I believe that if a document is strongly protected with a password (128-bit encryption), the chance of cracking the password and opening the document is, as of now, almost nil! All we can do is try our luck with these tools (yes, I really mean it). If we are really lucky, voila, we get the password; otherwise we can leave the password cracking tool running for millions of years, hoping that it will eventually crack the document password or recover the Excel password!!
This article is for learning purposes only and aims to show the vulnerability of legacy 40-bit RC4 encryption.
Common types of attack (methods of cracking – brute-force and dictionary attacks)
Password cracking tools usually use any or all of the basic methods below, with some mixing and matching in the approach.
1. Dictionary attack
As the name suggests, in this method the tool tries to open the document using words and combinations of words from an extensive list, called a dictionary. The candidates can also be dictionary + numbers, numbers + dictionary + special characters and so on. Usually there are options where the user can specify what kind of dictionary to use, what password lengths the tool should try, etc.
A dictionary attack usually takes less time to complete compared to a brute force attack, which may run forever!! I highlighted ‘to complete’ because completing the attack does not mean the password has been recovered. It may complete without any result and finally say ‘we could not find the password’. Damn, eh?
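To make the idea concrete, here is a minimal Python sketch of a dictionary attack. Everything in it is made up for illustration: the word lists, the suffixes, and especially the `check` function, which simply stands in for "try to open the document with this password". A real tool would be far more elaborate.

```python
from itertools import product

def dictionary_candidates(words, suffixes=("", "1", "123", "!")):
    """Yield each word in three case variants, combined with each suffix."""
    for word, suffix in product(words, suffixes):
        for variant in (word.lower(), word.capitalize(), word.upper()):
            yield variant + suffix

def dictionary_attack(words, is_correct):
    """Return (password, attempts), or (None, attempts) if the list runs out."""
    attempts = 0
    for candidate in dictionary_candidates(words):
        attempts += 1
        if is_correct(candidate):
            return candidate, attempts
    return None, attempts

# Hypothetical stand-in for "try to open the document with this password".
secret = "Dragon123"
check = lambda pw: pw == secret

print(dictionary_attack(["monkey", "dragon"], check))   # word is in the list
print(dictionary_attack(["monkey", "letmein"], check))  # word is not in the list
```

The second call shows the point made above: the attack completes, but with nothing to show for it, because the password was never in the dictionary.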
2. Brute force attack
While a dictionary attack tries to crack the document password using combinations of dictionary words, a brute force attack tries to recover the Word document password by trying every possible combination of letters, numbers, special characters, punctuation and so on. For example, it may start with a, then aa, then aaa and so on, then b, bb, bbb, and then other combinations such as ab, ac, ad etc. Each password recovery tool has its own order for these combinations, and there are usually options such as minimum and maximum password length and which kinds of characters to include (lower case, upper case, letters only etc.). If we can narrow down at least some of these options, there is a chance of recovering the password. If we are completely clueless about the password, the chance of recovering it is minimal.
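A small Python sketch shows both how such combinations can be generated and why the options matter so much. This version tries the shortest candidates first, which is only one possible order; the function names are my own and not taken from any particular tool.

```python
from itertools import product
import string

def brute_force_candidates(charset, min_len, max_len):
    """Yield every string over charset, shortest first."""
    for length in range(min_len, max_len + 1):
        for combo in product(charset, repeat=length):
            yield "".join(combo)

def search_space(charset_size, min_len, max_len):
    """Number of candidates the generator above will produce."""
    return sum(charset_size ** n for n in range(min_len, max_len + 1))

# Order of enumeration for a tiny character set.
print(list(brute_force_candidates("abc", 1, 2)))

# How fast the space grows with the chosen options.
lower_only = len(string.ascii_lowercase)
everything = len(string.ascii_letters + string.digits + string.punctuation)
print(search_space(lower_only, 1, 6))   # lower case only, up to 6 characters
print(search_space(everything, 1, 8))   # all printable symbols, up to 8 characters
```

Comparing the two numbers printed at the end makes it clear why knowing even a little about the password (its length, or that it has no special characters) changes the picture completely.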
OK, well, is there any guaranteed method (cracking RC4 encryption)?
The good news (bad news to some) is that yes, there is a guaranteed method. It will not recover the password, but it will remove it (decrypt the document) so that we can open the document without a password.
The bad news (good news to some) is that not all documents can be unprotected, especially those created and protected using later versions of software such as MS Office 2010. The reason is that they use stronger 128-bit encryption and other such methods to protect the document.
So what kind of protection can be broken?
Well, if the document is encrypted using 40-bit RC4 encryption, then we can break the encryption. Earlier versions of MS Office (97/2000/XP/2002/2003) use RC4 encryption, and of these MS Office 97/2000 use only 40-bit encryption. XP/2002 and 2003 give the user an option to choose different encryption methods. Hence for XP/2002 and 2003, we can break the protection of a Word or Excel document only if it was protected using the default method, that is, if the user did not choose any advanced option while setting the password.
How do we unprotect (decrypt) the document then (without recovering the document password)?
Before we look at this, it helps to know what RC4 encryption is. RC4 is a stream cipher: an encryption algorithm that encrypts a stream of data (the document content, in our case). Decrypting the encrypted stream requires a key, which is generated from the password.
So here is the catch: decryption needs the key, not the password itself. If we have the key, we can decrypt the document without the password.
How to get the encryption key (RC4 encryption)?
The method to get the key remains the same: brute force. But the advantage here is that brute forcing the key takes far less time than brute forcing the password. As it is 40-bit encryption, the entire key space (the set of all possible keys) holds 2^40 combinations, roughly 1.1 trillion. We try all these combinations to find the key, which can then be used to decrypt the document.
Searching the entire key space with today's processors (Intel dual core, Core 2 etc.) in a single process (one thread) takes only a couple of days, maybe a week. With a more advanced processor or multiple processes (threads), the key can be found in minutes to hours.
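The arithmetic behind these estimates is easy to check in a few lines of Python. The key-testing rates below are hypothetical round numbers picked only for illustration, not measurements of any real processor or tool.

```python
SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY

def exhaustive_search_seconds(key_bits, keys_per_second):
    """Worst-case time to try every key of the given size."""
    return 2 ** key_bits / keys_per_second

# Hypothetical rates (keys tested per second), for illustration only.
rates = {
    "one CPU thread": 2_000_000,
    "highly parallel hardware": 1_000_000_000,
}

print(f"40-bit key space: {2 ** 40:,} keys")
for label, rate in rates.items():
    days_40 = exhaustive_search_seconds(40, rate) / SECONDS_PER_DAY
    years_128 = exhaustive_search_seconds(128, rate) / SECONDS_PER_YEAR
    print(f"{label}: 40-bit in {days_40:.2f} days, 128-bit in {years_128:.2e} years")
```

At the assumed single-thread rate the 40-bit search finishes within a week, while the 128-bit figure stays astronomically large even at the faster rate.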
See the chart below from Wikipedia showing the theoretical limits of a brute force attack on various levels of encryption.
Do you still want to believe a person or tool claiming to brute force a 128-bit encrypted document, and waste your money on it?
So watch out: there are many tools out there with names like ‘recover excel password’, ‘password recovery xp’, ‘word password cracker’, ‘free password recovery tool’ etc. Before jumping to buy one, think twice, and follow the second part of this article to see if you can create one yourself!!!
There is also a newer concept called GPU acceleration, which uses the power of graphics cards for brute forcing; with it the RC4 key can be brute forced in seconds or minutes!! Oh, man.
Once we have the key, we can decrypt the document using the standard RC4 algorithm. (You can find this algorithm in various programming languages on the internet, so no worries.)
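For reference, the RC4 algorithm itself is only a few lines long. Below is a plain Python version with the usual two phases: key scheduling, then keystream generation. It shows the cipher only, not how a document stores or derives its key, and the "Key"/"Plaintext" pair is just a commonly quoted test input.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt data with RC4 (the same operation does both)."""
    # Key-scheduling algorithm: shuffle the state array using the key.
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]

    # Keystream generation: XOR each data byte with the next keystream byte.
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)

ciphertext = rc4(b"Key", b"Plaintext")
print(ciphertext.hex())
print(rc4(b"Key", ciphertext))  # running it again with the same key decrypts
```

Because RC4 just XORs the data with a keystream, applying the function twice with the same key gives back the original bytes, which is why having the key is enough to decrypt.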
How to validate the encryption key?
This question will have come to mind when I said we would try all possible key combinations to see which one decrypts the document. In the case of Word documents, the candidate key is verified against a verifier string (the encrypted verifier) stored in the RC4 encryption header of the document. So we should also know a little about document headers and the document structure.
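Here is a toy Python sketch of that check. It is a simplified model under my own assumptions: the "header" holds a 16-byte encrypted verifier followed by a 16-byte encrypted MD5 hash of that verifier, both encrypted as one RC4 stream, and a candidate key is accepted when the decrypted hash matches the hash of the decrypted verifier. Real Office files add a further hashing step to turn the 40-bit value into the working RC4 key, which is left out here, so take the exact layout from Microsoft's documentation. The search is also limited to a tiny range so that it finishes quickly.

```python
import hashlib

def rc4(key: bytes, data: bytes) -> bytes:
    """Standard RC4; the same call encrypts and decrypts."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)

def key_is_valid(key, encrypted_verifier, encrypted_verifier_hash):
    """Decrypt both header fields as one stream and compare the hashes."""
    plain = rc4(key, encrypted_verifier + encrypted_verifier_hash)
    verifier, verifier_hash = plain[:16], plain[16:]
    return hashlib.md5(verifier).digest() == verifier_hash

def find_key(encrypted_verifier, encrypted_verifier_hash, limit):
    """Try keys 0 .. limit-1, each encoded as 5 bytes (40 bits)."""
    for n in range(limit):
        key = n.to_bytes(5, "little")
        if key_is_valid(key, encrypted_verifier, encrypted_verifier_hash):
            return key
    return None

# Build a toy header with a known key so the search has something to find.
true_key = (3000).to_bytes(5, "little")
verifier = bytes(range(16))
blob = rc4(true_key, verifier + hashlib.md5(verifier).digest())
enc_verifier, enc_hash = blob[:16], blob[16:]

found = find_key(enc_verifier, enc_hash, limit=2 ** 12)
print(found == true_key)
```

Note that the check never needs the password or the document text: the verifier data in the header is enough to tell a right key from a wrong one.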
You may read more about the MS Word document header structure here. (Source: Microsoft MSDN)
If you would like to know how to read the document header programmatically, what the structure of a document is, how we validate the key against the verifier string and, most importantly, how we can technically crack the document, please check my next post, Crack document password – rc4 decryption (part 2).
I would also suggest searching the Internet for some of the new terms and keywords learned here if you are interested in learning further.
Security concerns – last but not least
So if you are really concerned about the security of your documents, I recommend always selecting 128-bit encryption and the other advanced security options under ‘Advanced’.
Also, think twice when you plan to buy a password cracking/recovery tool.
Don’t forget to ask your questions in the comments. Did you already know or try any of the methods explained here? Tell us more.
(First image courtesy: Stuart Miles / FreeDigitalPhotos.net)