Not Really a Blog

December 4, 2006

Speeding up trac’s response time

Filed under: Internet, Programming — Tags: , , , , — jesus @ 18:30

I’ve been trying to speed up a Trac installation over the last few days. The web interface took ages to display each directory or file in the Subversion repository, even though the repository wasn’t very big. The only recent change was that we had started using a vendor branch, imported into our main repository using svm.

So, after a few hours trying different solutions and reading Trac’s source code, I think I found the bottleneck: it was SQLite (http://www.sqlite.org/download.html). Trac uses a CachedRepository object to access the repositories. Whenever we want to get the changesets, a function to synchronize the repository is called:

class CachedRepository(Repository):
  def get_changeset(self, rev):
    # Synchronize the cache with the repository once per request
    if not self.synced:
      self.sync()
      self.synced = 1
    return CachedChangeset(self.repos.normalize_rev(rev), self.db, self.authz)

and that method, sync(), makes a call to:
youngest_stored = self.repos.get_youngest_rev_in_cache(self.db)

which is defined as:
def get_youngest_rev_in_cache(self, db):
    """Get the latest stored revision by sorting the revision strings
    numerically
    """
    cursor = db.cursor()
    cursor.execute("SELECT rev FROM revision ORDER BY -LENGTH(rev), rev DESC LIMIT 1")
    row = cursor.fetchone()
    return row and row[0] or None

And that SQL query was taking around 1-2 seconds each time it was executed. It turned out that we were running old versions of sqlite and pysqlite, so a ./configure && make && make install following the recommended installation saved my day :-)
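To see what that query actually does, here is a minimal sketch using Python’s standard sqlite3 module and an in-memory database. The one-column table layout and the helper build_demo_db are assumptions made for the demo, not Trac’s real schema; only the SELECT statement comes from the code above.

```python
import sqlite3


def youngest_rev_in_cache(db):
    """Return the numerically largest revision stored as a string, or None."""
    cursor = db.cursor()
    # Longer strings sort first and ties are broken lexicographically,
    # which matches numeric order for non-negative integers written
    # without leading zeros.
    cursor.execute(
        "SELECT rev FROM revision ORDER BY -LENGTH(rev), rev DESC LIMIT 1"
    )
    row = cursor.fetchone()
    return row[0] if row else None


def build_demo_db(revs):
    """Create an in-memory table for the demo (hypothetical layout)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE revision (rev TEXT PRIMARY KEY)")
    db.executemany("INSERT INTO revision (rev) VALUES (?)", [(r,) for r in revs])
    return db


if __name__ == "__main__":
    db = build_demo_db(str(n) for n in range(1, 101))
    print(youngest_rev_in_cache(db))
    # A plain MAX() on the text column compares strings, not numbers.
    print(db.execute("SELECT MAX(rev) FROM revision").fetchone()[0])
```

With revisions 1 to 100 stored as text, the first call returns 100 while the plain MAX() returns 99, which is why the query sorts by length first.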

Hope this is useful to somebody if it gets indexed by Google.

November 30, 2006

Setting up a subversion mirror repository using svnsync

Filed under: Internet, Programming — Tags: , , , — jesus @ 01:48

Subversion 1.4 comes with a new tool called svnsync, with which you can maintain mirror repositories quite easily. I’ve been working on one at work and would like to share my findings in case they interest anyone :-)

Understanding the problem

In order to have a mirror repository, it is important that commits only happen on the master and are then synchronized to the mirror using the svnsync program. The mirror repository can then be used for anything except committing: backups, high-speed checkouts, web front-ends, etc.

So, svnsync must be the only “one” able to commit to the mirror repository. If we use the Apache integration, there are various ways to do this. Let’s say, though, that we are using svn+ssh for authentication, in which case it is more complicated, as ssh access usually grants write access to the file system. So creating a separate user is going to be handy.

Creating and populating the repository

So, let’s say that we created a user called svnsync on the target machine and that we are going to create a new subversion repository in its home directory:

svnadmin create /home/svnsync/svn

Now, we need to set up a hook to let svnsync change the properties. For this, we create /home/svnsync/svn/hooks/pre-revprop-change with:

#!/bin/sh
USER="$3"

if [ "$USER" = "svnsync" ]; then exit 0; fi

echo "Only the svnsync user can change revprops" >&2
exit 1

We will grant access to the user running svnsync on the main machine by copying its ssh key to .ssh/authorized_keys. Now we only need to initialize the remote repository. Note that we can run svnsync from wherever we want but, for the sake of simplicity, we will run it on the main machine, where the original repository resides.

$ svnsync init --username svnsync \
      svn+ssh://svnsync@remote/home/svnsync/svn \
      file:///source-svn

Note:

  • The syntax is
    svnsync init DEST SOURCE

    That is, the destination repository goes before the source repository.

  • There is no “:” between the host name and the path to the remote repository.

This command initializes the destination repository, which is now ready to be populated. We can do that with this command:

svnsync synchronize --username svnsync \
       svn+ssh://svnsync@remote/home/svnsync/svn

As we have already initialized the repository, there is no need to specify the source repository. This will take more or less time depending on how big your repository is and how fast your network connection is. Hopefully it will have finished by the time you’ve had a coffee :-)

Creating another user to let users access the repository

So, we will now create a user called svn, which will be used to access the repository with the subversion client. As we are using svn+ssh, all we need to do is give everyone who has access to the main repository access to this account. If we are using ssh keys, it’s as easy as copying all the allowed keys to the /home/svn/.ssh/authorized_keys file.
Also, if we change the permissions to the repository at /home/svnsync/svn (and its parent) to be something like

drwxr-xr-x  7 svnsync users 4096 2006-11-28 17:30 svn/

we will let svnsync (the svnsync user) be the owner, with write permission to the repository, and let svn (and all the users ssh’ing in as it) have read-only access (provided both belong to the users group). A checkout through the svn user then works, but a commit fails:

$ svn co svn+ssh://svn@remote/home/svnsync/svn/test
A    test/trunk
[...]
A    test/trunk/readme
Checked out revision 2.
$ echo "test" >> test/trunk/readme
$ cd test/
$ svn -m "test commit" ci
Sending        trunk/readme
Transmitting file data .svn: Commit failed (details follow):
svn: Can't create directory '/home/svnsync/svn/db/transactions/2-1.txn':
Permission denied

And that’s all.

Committing to the master repository

In case you want to commit back to the master repository, you need to do a “svn switch --relocate” to point your working copy at the master repository. For that to work, the mirror must have the same UUID as the master; otherwise the relocate fails.

  1. To get the UUID on the main machine:
    svnadmin dump -r0 /source-svn | head -n 3 > saved-uuid
    
  2. Copy the file saved-uuid to the remote machine and do a
    svnadmin load --force-uuid /home/svnsync/svn < saved-uuid
    

So, things to take into account:

  1. When populating the repository, we use the svnsync user who has write permissions to the repository (svn+ssh://svnsync@…)
  2. When checking out, we use the svn user (svn+ssh://svn@…)

Automation

If we want to keep both repositories synchronized, we need to set up a couple of hooks on the source repository.

post-commit

Add this to the hook (it needs to be executable and must not have the .tmpl extension):

# Propagate the data to the remote repository
/usr/local/bin/svnsync synchronize --username svnsync \
        svn+ssh://svnsync@remote/home/svnsync/svn &

post-revprop-change

We also want property changes to be present at the remote repository:

# Propagate revision property changes to the remote repository.
# The hook receives the revision number as its second argument.
REV="$2"
/usr/local/bin/svnsync copy-revprops --username svnsync \
       svn+ssh://svnsync@remote/home/svnsync/svn "$REV" &

Note that we put them in the background to let the user go on with what they were doing.

Final notes

This is a quick guide on how to set things up to have a remote repository. There is much more to it than this, and I recommend that you read the documentation and, obviously, make a backup. Doing a “svnadmin dump” only takes a while and it’s really worth it.

In any case, just let me know if you find any errors or typos.

November 9, 2006

How to become a unix guru

Filed under: Fun, Internet — Tags: , — jesus @ 16:34

Just read this on the irc channel #netbsd on freenode. Found it funny :)

<alphawaves> how can I become a unix guru?
<alphawaves> is it only possible?
<sqweek> do you really want to?
<alphawaves> uhm y eah i try to several years
* AccessExcess has quit (Remote closed the connection)
<sqweek> do you still want to after i tell you that aside from general understanding it requires rote learning of countless quirks and inconsistencies between tools/distros/systems?
<alphawaves> uhm yeah i try since several years
* arnito has quit ("bye")
<marcello_> write your own unix kernel
<sqweek> well, it's the same as learning about everything else
<sqweek> strive for understanding

August 16, 2006

Playing with web services

Filed under: Internet, Programming — Tags: , , — jesus @ 03:25

Some of you may know that I run a website for displaying the temperature in Seville which has many visitors, mainly from Seville. I’ve run that website since 2001 and it is actually number one when you perform a google search on temperatura en sevilla. I find it quite useful.

Since then I have implemented various related services, basically playing around in my free time. Among them are an RSS service and a Jabber service (which you can subscribe to by adding sevillatemp@jabber.org to your Jabber application, even Gtalk). There are people who use it on a daily basis and I am glad that they do :)

So, I have had this idea in mind for some time: I wanted to create a web service so other people could use it in their applications, be they web or desktop applications. That would let me play with web services and, also, with Ruby.

So, I did some coding and implemented something that I would like you to know about.

The web service

The temperature is read from a couple of Dallas Semiconductor sensors using a program called digitemp. It is a unix command-line program, called from a set of shell scripts that use rrdtool to create all the fancy graphics you can see at http://temp.roncero.org. These scripts are executed every five minutes to both insert the data into the database and generate the graphs. The system usually works OK, but it has some flaws, mainly because of old hardware and the fact that I am living in London and the machine is in Seville.

In order to avoid problems with the existing system, I added a Ruby script to the previous shell scripts to generate an XML file that is later used by the web service. The file contains these values:

    1030AC15000800CB
    27.38
    Wed Aug 16 03:00:05 CEST 2006
    Celsius

    10952F1A00080063
    21.44
    Wed Aug 16 03:00:06 CEST 2006
    Celsius

This is what I am using to feed the web service. Notice that each of the sensors has a unique serial number, which is what calls to the web service use to identify it. I created a small Ruby program, mapped to the URL http://temp.roncero.org/temperature, to serve the requests. There are two methods for now:

  • list_sn: used to discover all the S/N available.
  • read_temp(sn): used to read the data from a sensor. sn is the serial number of the desired sensor.
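As a rough idea of what those two methods do on the server side, here is a sketch in Python rather than the Ruby the service is actually written in. The XML element names (sensors, sensor, sn, temperature, date, unit) and the helper load_readings are assumptions made up for this example; only the values and the output format come from this post.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout: the element names are assumptions for this sketch,
# only the values are taken from the post.
SAMPLE_XML = """
<sensors>
  <sensor>
    <sn>1030AC15000800CB</sn>
    <temperature>27.38</temperature>
    <date>Wed Aug 16 03:00:05 CEST 2006</date>
    <unit>Celsius</unit>
  </sensor>
  <sensor>
    <sn>10952F1A00080063</sn>
    <temperature>21.44</temperature>
    <date>Wed Aug 16 03:00:06 CEST 2006</date>
    <unit>Celsius</unit>
  </sensor>
</sensors>
"""


def load_readings(xml_text):
    """Parse the XML into a dict keyed by sensor serial number."""
    readings = {}
    for sensor in ET.fromstring(xml_text.strip()).findall("sensor"):
        readings[sensor.findtext("sn")] = {
            "temperature": float(sensor.findtext("temperature")),
            "date": sensor.findtext("date"),
            "unit": sensor.findtext("unit"),
        }
    return readings


def list_sn(readings):
    """Return the known serial numbers as a comma-separated string."""
    return ", ".join(sorted(readings))


def read_temp(readings, sn):
    """Return the latest reading for one sensor, formatted as text."""
    reading = readings[sn]
    return "%.2f degrees at %s" % (reading["temperature"], reading["date"])


if __name__ == "__main__":
    readings = load_readings(SAMPLE_XML)
    print(list_sn(readings))
    print(read_temp(readings, "1030AC15000800CB"))
```

Like the real thing, this does no error checking: asking for an unknown serial number simply raises a KeyError.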

Client programs

So here are two small client programs that you can use to access the data, one in Ruby and another in Python. Very simple, no error checking.

Ruby

#!/usr/bin/env ruby

require 'soap/rpc/driver'
proxy = SOAP::RPC::Driver.new("http://temp.roncero.org/temperature",
"http://temp.roncero.org/temperature")
proxy.add_method('read_temp', 'sn')
proxy.add_method('list_sn')
puts "List S/N: #{proxy.list_sn}"
puts "reading from s/n  1030AC15000800CB"
puts "#{proxy.read_temp("1030AC15000800CB")}"
puts "reading from s/n  10952F1A00080063"
puts "#{proxy.read_temp("10952F1A00080063")}"

Python

#!/usr/bin/env python

from SOAPpy import SOAPProxy

url = 'http://temp.roncero.org/temperature'
namespace = 'http://temp.roncero.org/temperature'
server = SOAPProxy(url, namespace)
print "Printing S/N"
print server.list_sn()
print "Temperature for 1030AC15000800CB"
print server.read_temp("1030AC15000800CB")
print "Temperature for 10952F1A00080063"
print server.read_temp("10952F1A00080063")

You can test both right now, come up with a new version in another language, or improve these ones. Here is a typical output:

$ ./client.rb
List S/N: 1030AC15000800CB, 10952F1A00080063
reading from s/n  1030AC15000800CB
27.31 degrees at Wed Aug 16 03:20:05 CEST 2006
reading from s/n  10952F1A00080063
21.25 degrees at Wed Aug 16 03:20:06 CEST 2006

So

Yeah, this is a work-in-progress thing. I have no idea if this is the way to do it. No WSDL, not yet. Expect it to fail at some time, as I have yet to implement more error checking, etc.

So, if you have an opinion on this kind of thing, I would like you to comment or make suggestions, and say if you would like it to do something different or better.

I have made some assumptions, basically because of my lack of knowledge on the matter.

  1. I have created the web service using Ruby’s own WEBrick web server on a high port, mapped to an Apache URL using ProxyPass directives. Is this the way to do it?
  2. I only created two methods. Would it make sense to have more methods? Like one that would only return the temperature value?

Have something to say? Go ahead :)

March 9, 2006

Google Calendar

Filed under: Fun, Internet — Tags: — jesus @ 18:50

So, Google is preparing another web application for all of us to enjoy: a web calendar to keep track of all the things you could put on your calendar, like iCal or similar. It looks like it’s going to be possible to sync it with your favourite desktop calendar application.

There are some good screenshots of this Google Calendar here. The question is, if Google already holds your email, is the master of internet search, and is deploying a bunch of other interesting services (Google Talk and Google Maps among others)… Isn’t having all this information about you too much? I mean, someday we might end up eating some Google snack which would put some RFID or GPS controlled chip inside our body and then, gradually, all of us would be under Google’s control ;-)

Sounds scary, but I am looking forward to testing this new application, hehehe

Chord

Filed under: Internet, Programming, Sussex University — Tags: , — jesus @ 18:01

As part of another assignment for the Distributed Systems course at the University of Sussex, we have been doing some research on Chord, a distributed hash lookup primitive, and, in the end, implementing part of what is called a Chord ring in Java.

Chord comes from a research paper that describes a distributed protocol for adding nodes to the system, inserting data into it, retrieving that data, and letting nodes leave the system. Node lookups and inserts take O(log n), so it’s quite efficient. Chord can be seen as a layer on top of which p2p-like applications can be built. One example of this is CFS, the Cooperative File System, a distributed read-only file system.

What we had to do is, basically, implement a simple Ring protocol and a Chord protocol:

  • Ring protocol: Each node in the system has a pointer to the next one in the ring, so, for instance, if any node wants to do a lookup on some data, it would ask the next one in the ring. If such node does not have it, it would forward that question to the next node in the ring and so forth. This would be O(n)
  • Chord protocol: In this version of the protocol, each node not only has a pointer to the next node in the ring, but also pointers to a number of other nodes, not necessarily consecutive ones. When a node wants to do a lookup, it uses a hash function and its list of pointers to forward the question to a node closer to where the information is likely to be stored. This is how we get an O(log n) algorithm. In fact, there is more to it than this, so I recommend that you have a look at the Chord paper if you are interested.
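The difference between the two lookups can be sketched in a few lines of Python (not the Java we used for the assignment, and not the simulator’s API). This is a toy model under stated assumptions: node ids and keys are plain integers in an 8-bit identifier space instead of hashes, every node knows the full sorted node list when its pointers are built, and nothing joins, leaves or fails. All names here are made up for the example.

```python
M = 8            # identifier bits; an assumption for this sketch
SPACE = 2 ** M   # ids and keys live in [0, SPACE)


def in_interval(x, a, b):
    """True if x lies in the ring interval (a, b]."""
    if a < b:
        return a < x <= b
    # The interval wraps around zero (a == b means the whole ring).
    return x > a or x <= b


def successor_of(ident, nodes):
    """First node id at or after ident, wrapping around the ring."""
    for n in nodes:  # nodes must be sorted
        if n >= ident:
            return n
    return nodes[0]


def build_fingers(nodes):
    """fingers[n][i] is the node responsible for id n + 2**i."""
    return {
        n: [successor_of((n + 2 ** i) % SPACE, nodes) for i in range(M)]
        for n in nodes
    }


def ring_lookup(start, key, nodes):
    """Follow next-node pointers only; returns (owner, hops)."""
    succ = {n: nodes[(i + 1) % len(nodes)] for i, n in enumerate(nodes)}
    node, hops = start, 0
    while not in_interval(key, node, succ[node]):
        node, hops = succ[node], hops + 1
    return succ[node], hops


def chord_lookup(start, key, fingers):
    """Jump to the closest finger preceding the key; returns (owner, hops)."""
    node, hops = start, 0
    while not in_interval(key, node, fingers[node][0]):
        nxt = fingers[node][0]  # fall back to the plain successor
        for f in reversed(fingers[node]):
            if f != key and in_interval(f, node, key):
                nxt = f
                break
        node, hops = nxt, hops + 1
    return fingers[node][0], hops


if __name__ == "__main__":
    nodes = list(range(0, SPACE, 8))  # 32 evenly spaced nodes
    fingers = build_fingers(nodes)
    print(ring_lookup(0, 250, nodes))
    print(chord_lookup(0, 250, fingers))
```

With 32 evenly spaced nodes, looking up key 250 from node 0 takes 31 hops around the plain ring and 5 hops with the finger pointers, and both end at the same node.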

So, all we had to do was implement both protocols in Java and run them in a simulator, as implementing a fully working solution was too much for a university assignment due in two weeks.

It turns out that using the simulator and programming the whole thing is harder than it sounds, as the simulator itself lacks good documentation and has some bugs. In any case, doing it has been fun and interesting, as it has helped me understand how a distributed system, such as a p2p system, works. So, I recommend that anyone interested have a look at how the protocol works (i.e., by reading the Chord paper ;) )


February 5, 2006

FON, google and Skype

Filed under: Internet, Sussex University — Tags: , , — jesus @ 23:34

So, they did it

Fon has just announced it would get 21.5 million dollars from Google, Skype, and Sequoia Capital. Something I already talked about a couple of days ago.

Amazing. Just waiting for comments right now!

January 29, 2006

About FON

Filed under: Internet, Random, Sussex University — Tags: , — jesus @ 21:00

As part of my studies on the e-business course at the University of Sussex, I have been asked to write a report on a company that is somehow involved with e-business (e-commerce, use of technology to become more profitable or efficient, etc.).

So, I have decided to do some research on FON, a company created by Martin Varsavsky, who is famous for founding Jazztel and Ya.com, among the biggest phone companies and ISPs in Spain.

FON’s idea is to profit from creating a worldwide network of wireless users, which has been called the p2p of wifi.

My task is, first of all, to write a background briefing report about the company, and afterwards an investment recommendation based on my research and study of the subject.

I chose this company because I have a friend who is working there and because I think it is interesting to see if FON’s business model is profitable or not. By the way, I have no access to internal sources as Teo (hi Teo!!) is quite secretive about it (I think because of NDAs).
Anyway, I’ll post more about it in the near future, I guess. Stay tuned.

May 27, 2005

Guadalinex 2005

Filed under: Internet, Linux — Tags: , — jesus @ 15:19

So, finally, the new Guadalinex is going to be made by Emergya, the company I worked for last year.

It’s going to be made in association with some other companies: Yaco, Interactors and, also, Canonical, the creators of the Ubuntu Linux distribution, on which the new Guadalinex is going to be based.

Good luck to them!
