Being subversive with Subversion: Mercurial in the middle

Posted by Nick on February 25th, 2010 filed in OS X, Programming, Python, Software Development

DVCS is a popular topic these days on the internets, but those of us who code for our day job find that, more often than not, we have to use a tried-and-true standard like CVS. Or, if you are lucky, SVN. Personally I think SVN is pretty good (especially compared to CVS), but the other day a feature of DVCSs caught my eye.

The ability to branch quickly and in a lightweight manner is one of the major selling points of systems like Mercurial and Git. Recently I’ve been thinking about trying out some refactorings, which can be difficult when you need a build-able version of your code at all times. Some things just take time, and the whole point of refactoring is to improve your code, so rushing through it to meet an unrelated deadline just doesn’t make a lot of sense.

Enter Mercurial.

There’s a project for Mercurial called hgsubversion that lets you pull from a Subversion repository and make a Mercurial repository locally. (Yes, git has something similar.) Then you can hack away to your heart’s content, using hg to branch and keep track of changes without pushing them out globally to the other SVN users.

This is exactly what all of the cool kids using hg, git, and bzr have been doing for years. Now those of us who talk to SVN can leverage this technique to bring a little more awesomeness to our day-to-day work, and no one is the wiser. At least that’s my theory.

Installing hgsubversion on a Mac or in Cygwin is like pulling teeth. I take that back, pulling teeth is not as painful. Plus you can leave your teeth under your pillow for some cash… but I digress. The best way to install it (for me at least) is to do the following:

  • Have the following installed: mercurial, easy_install, the XCode tools (command line tools like gcc)
  • Get a copy of the Python SWIG bindings for Subversion. CollabNet has prebuilt binaries that seem to work best. I just copied the bindings from /opt/subversion/lib/svn-python/ to my Python site-packages directory. I installed Python 2.6, so for me that path is: /Library/Frameworks/Python.framework/Versions/Current/lib/python2.6/site-packages
  • Get hgsubversion by typing at the command line: hg clone http://bitbucket.org/durin42/hgsubversion/ ~/hgsubversion
  • cd into ~/hgsubversion/ and type in python setup.py install
  • For me, the system had to download a few things and build them, so make sure you have XCode and the 10.4 SDK installed. (I’m running XCode 3.2.1 for this exercise)
  • Hopefully everything completes normally. If it does, do a victory shot. If it fails, do 2 shots and then try to figure out what went wrong.
  • Next make a file in your home directory called .hgrc
  • Inside the file put the following:

[extensions]
rebase=
svn=/Users/<YOUR USERNAME>/hgsubversion/hgsubversion

  • This should tell hg that you’ve got something extra for it. I went blind sometime during my attempts to get this working and had a typo in that line that kept me from getting it working for an embarrassingly long time.
  • If you have done everything right, the moon is in the proper phase, and the wind is blowing from the NW at 7 mph, then this command should let you pull down a project from Google code:

hg clone svn+http://<A PROJECT OF YOUR CHOICE>.googlecode.com/svn <WHATEVER DIR>

  • The end result should be a mercurial repository on your local machine. Refer to the link for hgsubversion for more details about commands and links to other installation instructions. (Some are older than others, and the commands have changed a little bit over time.)
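Given how easy it is to mistype the svn= path in .hgrc, a quick sanity check can save some time. Here is a minimal sketch, assuming a simple .hgrc that Python’s configparser can read (no %include lines or other Mercurial-specific syntax). The function name check_extension_paths is made up for this example and is not part of Mercurial or hgsubversion.

```python
import configparser
import os


def check_extension_paths(hgrc_text):
    """Return {name: path} for [extensions] entries whose path does not exist.

    Hypothetical helper: it only understands plain INI-style hgrc content.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(hgrc_text)

    missing = {}
    if parser.has_section("extensions"):
        for name, path in parser.items("extensions"):
            # An empty value (e.g. "rebase=") refers to a bundled extension,
            # so there is no path to check.
            if path and not os.path.exists(os.path.expanduser(path)):
                missing[name] = path
    return missing
```

To run it against a real configuration, pass in the contents of ~/.hgrc; an empty result means every extension path listed there exists on disk.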

So, that’s what I’m up to. Hopefully this will let me do some interesting experiments locally without sacrificing the safety blanket of using version control.


Test with big data sets

Posted by Nick on November 21st, 2009 filed in Programming, Python, Software Development

Every so often I re-learn this lesson: Make sure you test your code with the same amount of data that your users will use.

Developing with small data sets is fine, and most of the time that is what you want to do as you work out the kinks of the code. But when it is time to ship the code, you must test with a large data set.

For rent my resume.com I’ve been testing one portion of it with four documents of different sizes. That is a fluke, however: I chose the documents based on their contents, not their size. Today I made a simple change that turned out to have some unforeseen ripple effects.

When I write unit tests, they seem to fall into two categories: Tests where I just do a simple check on the size of a return (i.e. did I get 21 items in the list?), and tests where I check the contents of the resulting data.

As a side note, you should really do both kinds of tests, not one or the other. Simply checking to see that the right number of things was returned is no substitute for making sure that the correct data was actually returned!
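To make the two kinds of tests concrete, here is a small sketch using Python’s unittest module. The function unique_words is a hypothetical stand-in for the code under test, not anything from the real project.

```python
import unittest


def unique_words(text):
    """Hypothetical function under test: sorted, unique, lowercased words."""
    return sorted(set(text.lower().split()))


class UniqueWordsTest(unittest.TestCase):
    TEXT = "Python rocks and Python tests rock"

    def test_size(self):
        # Kind 1: only checks how many items came back.
        self.assertEqual(len(unique_words(self.TEXT)), 5)

    def test_contents(self):
        # Kind 2: checks that the items themselves are correct.
        self.assertEqual(
            unique_words(self.TEXT),
            ["and", "python", "rock", "rocks", "tests"],
        )
```

Saved in a file, these can be run with python -m unittest. The size test alone would still pass if the function returned five wrong words, which is why the contents test is worth the extra typing.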

Today I was re-running my pyunit tests and one of the four failed because the size of the returned list wasn’t what was expected. By coincidence this happened to be the biggest data set I was testing with. If I had not had this test, I would have thought everything was ok, but in truth my modified function was returning questionable data!

There are of course other benefits of using larger data sets, mainly seeing how your code works under stress. What works fine for 10 items might not be so great with 1000 items. Testing with a large data set at the end will help catch these problems and will also help you adhere to Knuth’s advice to not optimize code prematurely.
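One cheap way to get a large data set is to generate it. The sketch below runs the same size and contents checks at three scales; dedupe, make_data and check are all hypothetical names invented for this example, and the seeded random generator is just there to keep runs repeatable.

```python
import random


def dedupe(items):
    """Hypothetical function under test: drop duplicates, keep first-seen order."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


def make_data(n, seed=0):
    """Build n pseudo-random integers with plenty of repeats."""
    rng = random.Random(seed)
    return [rng.randrange(n // 2 + 1) for _ in range(n)]


def check(n):
    """Run a size check and a contents check on a data set of n items."""
    data = make_data(n)
    out = dedupe(data)
    assert len(out) == len(set(data))  # size check
    assert set(out) == set(data)       # contents check
    return len(out)


# Same checks, three very different data sizes.
for n in (10, 1000, 100000):
    check(n)
```

If the largest size is noticeably slow or fails where the smaller ones pass, that is the cue to start investigating, and not before.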

So, now I’m off to learn how to trace through python code to find out how such a seemingly simple “fix” to my code could so subtly break it…


More on breaking functional fixedness

Posted by Nick on November 2nd, 2009 filed in Thinking

Here’s a fun little video that poses an interesting question. If you had $5 and 2 hours, what could you do to raise the most money?

Start up studies: A pop quiz

Aside from various illegal schemes, the people in the story came up with some fairly inventive ideas. I hate to use the phrase “out-of-the-box” but that really sums up the thinking approach the participants used.

Having said that, I thought the restaurant idea was better than the “winning” idea. Why? It provided a service of tangible value to a larger group of people, and is something that is probably reproducible (i.e. you could probably do that over and over).

And as the presenter pointed out, sometimes we put constraints on a problem that are totally of our own making. Breaking free from those can lead to some really interesting (or profitable in this case) solutions!

See also: Overcoming functional fixedness


Rate-my-resume.com is now live

Posted by Nick on October 11th, 2009 filed in Productivity, Programming, Python, Software Development, Web

In my last post, I put up a link to a little project I’ve been working on. I finally got around to giving it a proper name. (re)Introducing:

Rate-my-resume.com

Now if you are wondering if your resume is a good match for a particular job posting, you can use my site to find out! At the moment I’m giving the score in terms of 0 (being a total non-match) to 100 (being the absolute perfect match). In this economy, the more your resume reflects the skills listed in a particular job, the more likely your resume will be looked at seriously.

If you run your resume through and it gives you a low score, look at your resume and the job posting and try and figure out what keywords are in the job posting that are not in your resume. Then, assuming you have the necessary experience, put those keywords into your resume! Be sure to add them in a way that makes sense to a person; after all, humans (especially HR people) don’t like to read fragments and words peppered into someone’s resume.
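For the curious, the “which keywords am I missing?” step can be approximated with a couple of set operations. This is only a naive sketch and not the scoring the site actually uses; the functions words, missing_keywords and naive_score are made up for illustration, and a real matcher would at least filter out common filler words.

```python
import re


def words(text):
    """Lowercased word set; keeps '+' and '#' so terms like c++ and c# survive."""
    return set(re.findall(r"[a-z0-9+#]+", text.lower()))


def missing_keywords(resume, job):
    """Words that appear in the job posting but not in the resume."""
    return sorted(words(job) - words(resume))


def naive_score(resume, job):
    """Percentage (0 to 100) of the job posting's words found in the resume."""
    job_words = words(job)
    if not job_words:
        return 0
    return round(100 * len(job_words & words(resume)) / len(job_words))


resume = "Python and Django developer"
job = "Python Django SQL"
print(missing_keywords(resume, job))  # ['sql']
print(naive_score(resume, job))       # 67
```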

Try out the site with your resume and see how you rank!

p.s. Python rocks!


Matching resumes to jobs

Posted by Nick on October 3rd, 2009 filed in AI, Google, Python, analytics, django

Have you ever looked at a job posting and tried to figure out if you are a good match for that job?

I’ve written a Google App Engine application to try and help people figure that out. Paste in a copy of your resume and a copy of the job description, and it will try and figure out how good a match you would be for that job.

Check it out: http://app.ironboundsoftware.com

I’m really impressed with the Google App Engine environment (go Python!) and had fun writing this. Hopefully this will help people out in their job hunt. Times are tough, and hopefully this little application will help someone get into the perfect job for them.

Try it out and let me know what you think!


Java Set

Posted by Nick on August 16th, 2009 filed in Java, Programming

Having a list of items is pretty useful. Sometimes it’s really useful to have no duplicates in that list. Java helps you to do this via the Set interface and its various implementations.

The Set interface basically defines a class that will hold a set of objects, and in the process not allow duplicates. (If you are ok with having duplicate items look into using something like ArrayList.) Like other things in the Collection family, Sets have an iterator() method that will provide you with an iterator so you can access the items being held by the set.

One example of a Set implementation (and probably one of the most common implementations of Set) is the HashSet. HashSet stores the objects passed to it via the add() method in a hash table based on each object’s hashCode(), so it makes no guarantee about the order in which the items come back out.

If you need to control the order that items are read from the Set (e.g. the objects should come out in alphabetical order) then the TreeSet class is the weapon of choice. TreeSet keeps its items sorted as they are added, either by their natural ordering (the objects implement Comparable) or by a Comparator that you can optionally pass to its constructor. This is enormously useful when your code receives some data from some source (a database, a web source, etc.) and you want to make sure that the data is sorted and is unique.

Sets pop up in several places in Java; one of the most notable is in Maps. Since a map is a key-value data structure, the keys must be unique. As a result, if you call keySet() on a Map you will get a collection of the keys for that map, and it will be a Set object.


Hero of the week: Stewart Butterfield

Posted by Nick on June 12th, 2009 filed in Entertainment

Not only did he help create a kick-ass useful website (flickr), but he knows how to respond to mis-directed emails:

http://valleywag.gawker.com/5288759/flickr-founder-calls-nuked-user-a-dick


Lightweight TDD

Posted by Nick on May 25th, 2009 filed in Programming, Software Development, Thinking

The more I use Unit Testing (particularly JUnit) the more I like it. It is a great way of tracking progress in your code, and more importantly making sure you haven’t broken something in the process.

I’m not a huge fan of traditional Test Driven Design (TDD) though. My biggest complaint is that writing a battery of tests before writing the actual code feels like putting the cart before the horse. I’ve been experimenting with it, and most of the time I’ve found that if my code is structured “correctly” (i.e. a well defined API/interfaces, dependency injection, etc.) TDD will work pretty well.

However, I have found that I like to use TDD one test case at a time. Basically I will get the basic framework of my class(es) together, and then as I refine the capabilities of the class, add in a few tests to catch one or two conditions. Then I work on my code to make sure that it is performing as expected. Once everything is going well and the tests pass, it is time to move on to the next part of the class.

The big advantage for me in this is that I only have to worry about getting a small number of tests to pass instead of all (or a large number) of them. By breaking the tasks down into smaller pieces I find that I spend less time “dreaming” about how my code would work (and writing tests that don’t accomplish much or have to be re-written as reality sinks in). Instead I’m able to focus on a single problem and solving it.

This is similar to the approach advocated in the unit testing community: When a bug is found, create a test that exposes it. Then fix the code and the test should prove that the underlying problem is gone. I really like that approach and I have begun doing that as often as I can. So far it has really paid off in terms of making sure my code doesn’t have a bad case of “but-I-already-fixed-that!” type of bugs.
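The bug-first workflow looks roughly like this. The sketch uses Python’s unittest rather than JUnit, but the idea carries over directly; the average function and the empty-list bug are invented for the example.

```python
import unittest


def average(values):
    """Hypothetical function; the first version crashed on an empty list."""
    if not values:  # the fix, added only after the test below was written
        return 0.0
    return sum(values) / len(values)


class AverageRegressionTest(unittest.TestCase):
    def test_empty_list(self):
        # Step 1: written to reproduce the reported bug (ZeroDivisionError).
        # Step 2: the fix above makes it pass.
        # Step 3: it stays in the suite so the bug cannot quietly return.
        self.assertEqual(average([]), 0.0)

    def test_normal_case(self):
        # Guards against the fix breaking the behaviour that already worked.
        self.assertEqual(average([2, 4]), 3.0)
```

The test is written and seen to fail before the fix goes in; that failing run is the proof that the test really exercises the bug.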


Comma Separated Values

Posted by Nick on April 11th, 2009 filed in Productivity, Programming, Python

Question: How much does python rock?

Answer: More and more every day.

Today I was writing (for what seems like the millionth time) a little script to read a CSV (Comma Separated Values) file. After running into the same issues over and over (picking a delimiter, escaping delimiters, etc.) I decided my sanity is worth the 30 seconds it would take to see if someone else has already written a CSV library. It turns out python has one built in. Since 2.3. D’oh.

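import csv

# csv.reader wants an open file (any iterable of lines), not a filename.
# On Python 2, use open('myfile.csv', 'rb') instead.
with open('myfile.csv', newline='') as f:
    lines = list(csv.reader(f))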

That’s all that’s needed to read in a csv file and have it properly handle the delimiters, even when they are inside of quoted text (e.g. something like “$3,000” will be read as the single value $3,000 instead of being split into $3 and 000).
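Here is a tiny self-contained demonstration of the quoting behaviour. It feeds the reader an in-memory string via io.StringIO instead of a real file, purely so it can be run as-is; the sample data is made up.

```python
import csv
import io

# Two rows; the price in the second row contains the delimiter inside quotes.
data = 'item,price\nlaptop,"$3,000"\n'

rows = list(csv.reader(io.StringIO(data)))
print(rows)  # [['item', 'price'], ['laptop', '$3,000']]
```

Splitting the same line on commas by hand would have produced three fields for the second row instead of two.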

Python rocks again.


Lasik: Still loving it 2 years later

Posted by Nick on March 11th, 2009 filed in Technology

About two years ago I finally decided I was tired of my scratched up glasses and that I would get Lasik eye surgery. I was talking about this with some friends recently and thought I should do a post about how things are now that I’m a couple of years away from it. I did a lot of research on Lasik before my operation, and most of what I found talked about the procedure itself and the time immediately after, so this post covers how things are a while after the surgery.

In short: Pretty good!

As a computer programmer I’m rather attached to my eyesight. I was concerned that, staring at screens all day, I might be in for a rough ride, because you will have “dry eyes” after the surgery. Some days were tough, but using the preservative-free eye drops (like the doctor suggests) really helped with this. For me the dry eye problem went away pretty quickly; I would say within a month or two I was only having to put the drops in a few times a week as opposed to a few times a day.

The doctors said my eyes were in good shape and it really showed in the recovery phase. The only thing that seemed to heal slowly for me was my night vision.

Loss of night vision is a common side effect of Lasik. My night vision has gotten better since the surgery, but it took almost a year, and it still doesn’t feel quite the same as it did pre-surgery. The flip side of this is that in low-light conditions, I feel like I can make out some details a little bit sharper than I could before. I know that sounds odd, but as long as there’s some good light, like a quarter-moon or so, I feel like I can see better than I could with my glasses in that same condition.

So, all in all I’m pretty happy with the way things turned out. Some people do suffer from negative side effects for longer or more intensely than others, but I think you give yourself the best chance if you follow your doctor’s instructions:

  • Moisturize (your doctor will tell you how, usually with preservative-free eye drops)
  • Don’t rub your eyes! :)
  • Use the medicines and creams as directed by the doctor.
  • Take your vitamins, eat your Wheaties, and get lots of rest.

So that is pretty much my follow-up report. I’m glad I did it and I encourage others to talk to their eye doctor if they are thinking about it. I had my surgery at LasikPlus here in Atlanta, GA, and the staff was great and very helpful. Check them out!