More on setting up a Subversion mirror repository
A few days ago I wrote about setting up a Subversion mirror repository using svnsync. In that entry I used svn+ssh authentication, but there are some gotchas to take into account to avoid a security issue. If you recall, we had a Unix user through which people access the repository. Without any extra protection, anyone whose ssh key is in that user’s ~/.ssh/authorized_keys file can log into the system. If that user relies on public key authentication to reach the account that holds the remote repository, any committer can read the private key, and thus log into the remote machine and write to it.
To avoid this scenario we need to set things up so that users can commit to the main repository but cannot reach the remote repository. That, again, can be done by creating a different user (remember, all this applies to the main repository box, not the remote box).
Say:
- svn: The user used for the repository. It lets any user have write permission on the repository. Its authorized_keys file also holds the public keys of all the committers, so that they can reach the server with svn+ssh authentication as the generic user svn when the repository has to be accessed from a remote location, which is easy to do by forwarding your credentials.
- svnsync: A user which has access to the repository (read access is enough) and owns a couple of wrapper scripts around svnsync. It uses an ssh key to access the remote repository. The rest of the users have no read permissions on this user’s home directory.
- The rest of the users: the committers’ own accounts. They have write access to the repository. As you may know, if you use svn+ssh authentication, whenever you commit you are basically executing svnserve on the repository under your own user ID (that’s why you need write permissions).
This way, committers won’t have access to the svnsync private key. But we must grant them access to the wrapper scripts, so that whenever they commit, svnserve can execute those scripts through a post-commit hook.
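The whole arrangement depends on the svnsync home directory and key being closed to everybody else, so it may be worth checking that once in a while. What follows is only a minimal sketch: the helper names `is_private` and `report` are made up for this example, and the paths you pass in (for instance /home/svnsync and wherever the key actually lives) are assumptions about your layout.

```python
import os
import stat


def is_private(path):
    """Return True if neither group nor others have any permission on path."""
    mode = os.stat(path).st_mode
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


def report(paths):
    """Map each path to True (private), False (too open) or None.

    None means the path is missing or cannot even be reached by the
    user running the check, which is the expected result for a committer.
    """
    result = {}
    for path in paths:
        result[path] = is_private(path) if os.path.exists(path) else None
    return result
```

Run as root or as svnsync, something like `report(["/home/svnsync"])` should map the directory to True if it is properly closed. Run as an ordinary committer against files inside that directory, getting None back is the good outcome.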
All of this can be achieved with sudo. Say that we have two different wrapper scripts:
- /home/svnsync/bin/synccommit.sh
    #!/bin/bash
    # This is to synchronize the repository
    # $1 will be the repository
    if [[ -z $1 ]]
    then
        # The repository should be a parameter
        exit 1
    fi
    DATE=$(date +%Y%m%d-%H%M)
    echo "[$DATE] Committing to repository $1. (output below if any)" >> /home/svnsync/svnsync-commit.log 2>&1
    /usr/local/bin/svnsync synchronize --username svnsync "$1" >> /home/svnsync/svnsync-commit.log 2>&1
- /home/svnsync/bin/syncprop.sh
    #!/bin/bash
    # This is to synchronize a revision property
    # $1 will be the repository
    # $2 will be the revision number
    if [[ -z $1 || -z $2 ]]
    then
        # The repository and the revision should be parameters
        exit 1
    fi
    DATE=$(date +%Y%m%d-%H%M)
    echo "[$DATE] Updating property (Revision $2) to repository $1. (output below if any)" >> /home/svnsync/svnsync-properties.log 2>&1
    /usr/local/bin/svnsync copy-revprops --username svnsync "$1" "$2" >> /home/svnsync/svnsync-properties.log 2>&1
We can call them by setting up the hook scripts as:
* *post-commit*:
    # Propagate the data to the remote repository. Putting it in the background
    sudo -u svnsync /home/svnsync/bin/synccommit.sh svn+ssh://svnsync@remote/home/svnsync/svn &
* *post-revprop-change*:
    # Propagate the changes to the remote repository. Putting it in the background
    sudo -u svnsync /home/svnsync/bin/syncprop.sh svn+ssh://svnsync@remote/home/svnsync/svn "$REV" &
All we need to do now is allow the committers to execute both wrapper scripts by adding the following to the /etc/sudoers file (remember to edit it with visudo):
    %commiters ALL = (svnsync) NOPASSWD: /home/svnsync/bin/synccommit.sh, /home/svnsync/bin/syncprop.sh
    svn ALL = (svnsync) NOPASSWD: /home/svnsync/bin/synccommit.sh, /home/svnsync/bin/syncprop.sh
Here commiters is a common group that all the committers belong to.
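Hooks don't have to be shell scripts, so the post-commit side could also be written in Python. This is just a sketch under a few assumptions: `build_command` and `run_hook` are names invented for this example, the wrapper path and mirror URL are the ones used in this entry, and I added sudo's `-n` flag so that a missing NOPASSWD rule fails instead of waiting for a password.

```python
import subprocess

# Values taken from the hook lines in this entry.
WRAPPER = "/home/svnsync/bin/synccommit.sh"
MIRROR_URL = "svn+ssh://svnsync@remote/home/svnsync/svn"


def build_command(wrapper=WRAPPER, mirror_url=MIRROR_URL, sync_user="svnsync"):
    """Build the argv that runs the wrapper as the sync user through sudo."""
    # -n: never prompt; fail right away if sudoers does not allow the command.
    return ["sudo", "-n", "-u", sync_user, wrapper, mirror_url]


def run_hook():
    """Start the synchronization without waiting for it to finish."""
    # Popen returns immediately, which plays the role of the trailing "&".
    # The streams are detached so the committing client is not kept waiting
    # on output from the background process.
    return subprocess.Popen(
        build_command(),
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
```

A real hook file would end with a call to `run_hook()` and would have to be executable. The post-revprop-change version would look the same, with syncprop.sh as the wrapper and the revision number appended to the command.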
Speeding up trac’s response time
I’ve been trying to speed up an installation of trac over the last few days. The web interface took ages to display each of the directories or files within the Subversion repository, even though the repository wasn’t too big. The only change to the Subversion repository was that we had started using a vendor branch imported into our main repository using svm.
So, after a few hours trying different solutions and reading trac’s source code, I think I found where the bottleneck was. Well, it was SQLite (http://www.sqlite.org/download.html) that was causing it. Trac uses a CachedRepository object to access the repositories. Whenever we want to get the changesets, a function to synchronize the repository is called:
    class CachedRepository(Repository):
        def get_changeset(self, rev):
            if not self.synced:
                self.sync()
                self.synced = 1
            return CachedChangeset(self.repos.normalize_rev(rev), self.db, self.authz)
and that method, sync(), makes a call to:
    youngest_stored = self.repos.get_youngest_rev_in_cache(self.db)
which is all this:
    def get_youngest_rev_in_cache(self, db):
        """Get the latest stored revision by sorting the revision strings
        numerically
        """
        cursor = db.cursor()
        cursor.execute("SELECT rev FROM revision ORDER BY -LENGTH(rev), rev DESC LIMIT 1")
        row = cursor.fetchone()
        return row and row[0] or None
And that SQL query was taking around 1-2 seconds each time it was executed. It turned out that we were running an old version of sqlite and pysqlite, so a ./configure && make && make install following the recommended installation saved my day :-)
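If you want to check whether you are hitting the same thing, the query is easy to time outside of trac. The sketch below is an approximation, not trac's code: it uses Python's bundled sqlite3 module rather than pysqlite, the one-column table is a simplified stand-in for the real revision table, and `build_cache` and `youngest_rev` are names made up for this example. It also shows why the query sorts by length first: revisions are stored as strings, so a plain descending sort would put "99" ahead of "120".

```python
import sqlite3
import time

# The query quoted above, unchanged.
QUERY = "SELECT rev FROM revision ORDER BY -LENGTH(rev), rev DESC LIMIT 1"


def build_cache(revisions):
    """Create an in-memory stand-in for the revision cache table."""
    db = sqlite3.connect(":memory:")
    # Assumed, simplified schema: only the text 'rev' column matters here.
    db.execute("CREATE TABLE revision (rev TEXT PRIMARY KEY)")
    db.executemany(
        "INSERT INTO revision (rev) VALUES (?)",
        [(str(n),) for n in range(1, revisions + 1)],
    )
    db.commit()
    return db


def youngest_rev(db):
    """Run the query and return (revision, seconds elapsed)."""
    start = time.perf_counter()
    row = db.execute(QUERY).fetchone()
    elapsed = time.perf_counter() - start
    return (row[0] if row else None), elapsed


db = build_cache(20000)
rev, seconds = youngest_rev(db)
print("SQLite library version:", sqlite3.sqlite_version)
print("youngest cached revision:", rev)
print("query time: %.4f s" % seconds)
```

Pointing `sqlite3.connect` at a copy of the real trac database instead of `:memory:` (and skipping the table creation) gives a number you can compare before and after upgrading.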
Hope it is useful to anybody if it gets indexed by Google.