svnadmin: Decompression of SVNDiff Data failed and fsfsverify.py


Subversion is generally a pretty robust code versioning tool.  If you manage Subversion repositories, the last thing you want is a corrupted repository.  It can cause all sorts of issues:

  1. Users won’t be able to update their working folders correctly
  2. You may be prevented from getting the head revision (the latest commit)
  3. Even worse, backups may fail!

All instructions provided below are for Windows FSFS repositories only.  If you don’t know which repository type you have (Berkeley DB or FSFS), see my other post here.

BACKUP YOU FOOL!

BEFORE you go any further, BACKUP YOUR REPOSITORY STUPID. I don’t have clear-cut instructions yet on doing SVN backups, so Google it yourself. :)  If you have an FSFS repository, ensure no users are currently committing and then you can back up the entire folder of your repository in Windows.  BUT as I said, please make sure you have done your due diligence and backed up the repository first!
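For what it’s worth, here is a minimal sketch of the "copy the entire folder" backup described above. It assumes nobody is committing while it runs, and the function name and the timestamped folder naming are made up for illustration. The only Subversion-specific detail it relies on is the db/fs-type file, which holds the text "fsfs" for an FSFS repository.

```python
import shutil
import time
from pathlib import Path


def backup_repository(repo_path, backup_root):
    """Copy the whole repository folder into a new timestamped folder.

    Hypothetical helper: assumes no commits happen during the copy.
    """
    repo = Path(repo_path)
    # db/fs-type holds "fsfs" for FSFS repositories ("bdb" for Berkeley DB).
    fs_type = (repo / "db" / "fs-type").read_text().strip()
    if fs_type != "fsfs":
        raise ValueError(f"expected an FSFS repository, found {fs_type!r}")
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root) / f"{repo.name}-{stamp}"
    # copytree refuses to overwrite an existing folder, so an older
    # backup is never clobbered by accident.
    shutil.copytree(repo, target)
    return target
```

Subversion also ships its own copy command, “svnadmin hotcopy <location of repository> <backup location>”, which may be the safer choice if you cannot guarantee that nobody is committing.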

  1. Open a command prompt on your svn server and make sure you have sufficient admin privileges.
  2. Type “svnadmin verify <location of repository>” where <location of repository> uses the notation c:\<repository>.  You might need to locate the svnadmin.exe executable on your system if it’s not in your default path and run it from there.
  3. If all goes well, you should see a series of verification messages, one per revision, and then be returned to the command prompt.  Note that in my example, I am using VisualSVN, a one-click SVN installer, so where I ran svnadmin from might be different than your setup, but the concept is the same.
  4. Now oddly enough, this might not actually be giving us the true picture of our repository, since svnadmin verify doesn’t always verify every single revision for us.  So we need to check our actual repository revision number to confirm it ran through them all.  For that, go to <your repository location> and have a look at the highest revision number in <your repository location>\db\revs
  5. If things DON’T go well, you will get something like this: “svnadmin: Decompression of svndiff data failed”
  6. Svnadmin also has an option to repair repositories, but it can only do so for Berkeley DB repositories.  For FSFS repositories you will need to download a Python script called fsfsverify, originally written by John Szakmeister.  You can download the latest version of the fsfsverify script from him (dated February 20, 2008) at http://www.szakmeister.net/fsfsverify.tar.gz.  If that link doesn’t work you can also get a copy from http://svn.collab.net/repos/svn/trunk/contrib/server-side/fsfsverify.py.  Save the file in a location of your choice.  I used the location of the svn executables in my example.
  7. Now, without the Python script interpreter, your computer can’t deal with .py files, so you also need to install Python.  Here’s where I got mine (I took the most recent one listed): http://www.python.org/download/releases/
  8. Now, assuming you made a backup of your repository already, run fsfsverify as “fsfsverify.py -f <location of repository>\db\revs\<next revision number>” where <next revision number> is the revision that should have come right after the last one svnadmin verify reported before it stopped.
  9. Hopefully at this point your repository has been fixed.
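Steps 3, 4 and 8 boil down to comparing the last revision svnadmin verify reported against the highest revision file on disk. Here is a rough sketch of that comparison. It assumes verify prints lines of the form "* Verified revision N." and that revision files under db\revs are named by their number; the function names are hypothetical and not part of any Subversion tool.

```python
import re
from pathlib import Path


def last_verified_revision(verify_output):
    """Return the highest N from lines like '* Verified revision N.'."""
    revs = [int(n) for n in re.findall(r"Verified revision (\d+)", verify_output)]
    return max(revs) if revs else None


def youngest_revision_on_disk(repo_path):
    """Return the highest numeric file name found under db/revs."""
    revs_dir = Path(repo_path) / "db" / "revs"
    # rglob also covers layouts that group revision files into subfolders.
    numbers = [int(p.name) for p in revs_dir.rglob("*")
               if p.is_file() and p.name.isdigit()]
    return max(numbers) if numbers else None


def first_unverified_revision(verify_output, repo_path):
    """Return the revision to inspect next, or None if verify covered them all."""
    youngest = youngest_revision_on_disk(repo_path)
    if youngest is None:
        return None
    verified = last_verified_revision(verify_output)
    if verified is None:
        return 0
    return verified + 1 if verified < youngest else None
```

To use it, capture everything svnadmin verify prints (both standard output and standard error, since the messages may go to either) into a string and pass it in along with the repository path. The number that comes back is the <next revision number> to hand to fsfsverify.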

FSFSVerify never fixed the problem for me

Unfortunately, in my case, fsfsverify didn’t work as expected.  I think my repository is completely broken at the moment.   After running fsfsverify as I described above, it tells me to run fsfsverify without the -f option.


I did so, and I get

Error InvalidCompressedStream: Invalid compressed instr stream at offset 14534587 (Error -3 while decompressing: incorrect header check).  Try running with -f to fix the revision.

It tells me to run fsfsverify with the -f option.  So here I am stuck in a loop without any real fix to my repository.


Unfortunately, I don’t have a fix available either, so that really sucks.  I am trying to get some answers from John Szakmeister, so hopefully he can help.  If I have some new information for you, I will post a blog entry.


Update 7:30pm: I received a reply from John and he said “Just an FYI: it sometimes takes more than one run through with fsfsverify.  The root cause of the corruption in a great many cases is that there is a repeated block of data present in the data stream. I’ve seen instances where that block was repeated 9 times.  So it took 9 passes via fsfsverify before it worked again.”

I am going to give his recommendation a try.  I think I have run it about 5 or 6 times already, but I will run it again and watch the results more closely.
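John’s advice amounts to a loop: run with -f, check again without -f, and repeat until the check comes back clean or you give up. A minimal sketch of that loop follows. It rests on an assumption I have not confirmed, namely that fsfsverify.py exits with a non-zero status when it reports an error; if it doesn’t, you would need to inspect its printed output instead. The function names, the python_exe parameter and the pass limit are all made up for illustration.

```python
import subprocess


def run_fsfsverify(script, rev_file, fix, python_exe="python"):
    """Run fsfsverify.py once and report whether it exited with status 0.

    ASSUMPTION: a non-zero exit status means the script reported an error.
    """
    cmd = [python_exe, script]
    if fix:
        cmd.append("-f")  # the fix option used in the steps above
    cmd.append(rev_file)
    return subprocess.run(cmd).returncode == 0


def repair_with_retries(script, rev_file, max_passes=10, run=run_fsfsverify):
    """Alternate fix and check passes.

    Returns the number of -f passes it took, or None if the revision
    still fails the check after max_passes fix passes.
    """
    if run(script, rev_file, False):      # already clean, nothing to fix
        return 0
    for fix_pass in range(1, max_passes + 1):
        run(script, rev_file, True)       # one -f pass
        if run(script, rev_file, False):  # plain check pass
            return fix_pass
    return None
```

The run parameter is only there so the loop can be tried out with a stand-in function before pointing it at a real repository. And yes, run it against a backed-up repository only.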

Update October 21st 2008: John Threadgill from Atlas Tech emailed me just now.  My post info above was able to help him.  Perhaps his experiences will help you.  See below….

===========

Related issue received from a visitor: Malformed File

I received the following email from John Threadgill on October 21, 2008.

Hi Mayur

I had a similar problem with one of our Subversion FSFS repositories (GeoMaster). We are using VisualSVN 1.6.1. From a client PC I was unable to check out a branch or perform a show log in TortoiseSVN in this repository. These operations resulted in messages such as “Could not read chunk size”.

I used :

    svnadmin verify GeoMaster

on the server to try and find the problem. It did not flag anything as being wrong. I used John Szakmeister’s fsfsverify, targeting various revisions where I thought the problem was and that didn’t seem to find anything either. The GeoMaster repository is relatively large with over 13000 revisions. The problem seemed to be in the range between 13050 and 13100. In the end I identified the revision by doing a:

directly on the server. This cuts out apache and buffering etc. This choked on revision 13055 with the message “malformed file”.

After a bit of searching I came across a blog (dated way back in Feb 2005) which mentioned the malformed file error: http://www.raditha.com/blog/archives/000668.html

The culprit here turned out to be a trivial error in the revision properties. These are stored in the db/revprops directory of the repository. In my case, 13055 had an invalid revision properties file. The value for svn:log was preceded by the line V 262. The 262 is supposed to be the length of the value. The actual log comment lines were only 173 characters in length. Once I had corrected this, the repository was fine. I could perform a show log and check out the branch with no problems.

Not too sure what could have caused this or how to avoid it in future. The svnadmin verify did not catch it. Maybe fsfsverify could be extended to cover this type of error. The offending revision was a fairly large commit (over 470 files in 6 directories) and our LAN is only 100 Mbit and occasionally can really grind when things are busy. Maybe these were contributing factors. I am surprised that an error that existed over 3 years ago could still be lurking in the SVN code.

I thought I should let you know my experiences. Perhaps you could post it as a comment at the bottom of your blog article and it might help someone else?

Thanks for posting your problem in your blog. It helped point me in the right direction.

Cheers & happy coding,

John

==========
John Threadgill
ATLAS Technology
www.atlastech.co.nz
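The revision-property problem John describes can be checked mechanically. Below is a rough sketch that reads one file from db/revprops and confirms that every declared length matches the data that follows it. It assumes the usual layout of these files: a “K n” line, then an n-byte key, a “V n” line, then an n-byte value, repeated and closed by an END line, with lengths counted in bytes as far as I know. The function name is hypothetical; this is not part of svnadmin or fsfsverify.

```python
def check_revprops(data):
    """Return a list of problems found in the bytes of one revprops file."""
    problems = []
    pos = 0
    while pos < len(data):
        line_end = data.find(b"\n", pos)
        if line_end == -1:
            problems.append(f"offset {pos}: unterminated header line")
            break
        header = data[pos:line_end]
        if header == b"END":
            return problems
        parts = header.split(b" ")
        if len(parts) != 2 or parts[0] not in (b"K", b"V") or not parts[1].isdigit():
            problems.append(f"offset {pos}: expected 'K n', 'V n' or 'END', found {header!r}")
            break
        length = int(parts[1])
        body_end = line_end + 1 + length
        # The declared number of bytes must be followed directly by a newline.
        if data[body_end:body_end + 1] != b"\n":
            problems.append(f"offset {pos}: {header.decode()} does not match the data that follows")
            break
        pos = body_end + 1
    else:
        # Ran off the end of the file without ever seeing END.
        problems.append("missing END marker")
    return problems
```

Read the file in binary mode (for example with open(path, "rb").read()) and pass the bytes in. An empty list means the lengths add up; in John’s case the “V 262” line in front of a 173-character log message would have been reported. As always, work on a copy of the file, not the live repository.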


One Response to “svnadmin: Decompression of SVNDiff Data failed and fsfsverify.py”

  1. englund Says:
    July 29th, 2008 at 7:26 am

    Did you ever find a solution for this?
    I’m having the same problem.. I’ve run fsfsverify like 15 times now..
