Archive for January, 2007

My “Wow!” moment learning Ruby

Friday, January 12th, 2007

I'm an absolute beginner in Ruby, but recently, after moving to my new web host, I was tempted to give it a try.

My background is mainly in Java, with some Smalltalk, simply because since I graduated I have never seen such a beautiful syntax. Even now I sometimes write things in Smalltalk when I need to think about the problem at hand without worrying about the environment or the language's little tricks and details. After all, I think compiled and typed languages are doomed to become history.

My new web host gave me a lot of space and bandwidth, but only two practical choices of programming language for dynamic web sites: PHP or Ruby. For personal reasons I hate PHP (please, I don't want to start a flame war about which language is best; this is just my personal opinion). My first job was programming in PHP, and I don't have good memories…

I'm studying Ruby right now, and for the people who come here looking for Java tips or tricks, I want to say I just had a "Wow!" moment. Look at this little piece of code:

class Product < ActiveRecord::Base

  validates_presence_of :title, :description, :image_url
  validates_numericality_of :price, :only_integer => true

  protected
  def validate
    errors.add(:price, "should be positive") if price.nil? || price <= 0
  end
end

Don't worry about what this code does. But the syntax is very cool, isn't it? To me, some constructions don't seem so strange, because they look like Smalltalk, but for a Java man… Do you see the 'if' at the end of the statement? Or the method call on a possibly null reference?

See you at my next "Wow!" moment…

An SCJP question about Java enumerations

Thursday, January 11th, 2007

Given the following:

public enum Wallpaper {
  BROWN, BLUE, YELLOW;
}

Which of the following are legal?

  1. enum PatternedWallpaper extends Wallpaper {
    STRIPES, DOTS, PLAIN;
    }
  2. Wallpaper wp = Wallpaper.BLUE;
  3. Wallpaper wp = new Wallpaper(Wallpaper.BLUE);
  4. void aMethod(Wallpaper wp) {
    System.out.println(wp);
    }
  5. int hcode = Wallpaper.BLUE.hashCode();

Answer: only items 2, 4 and 5 are correct. We can't extend or instantiate an enumeration. Item 2 is the correct way to get an enumeration reference, item 4 shows a method receiving an enumeration reference, and item 5 shows a call to the hashCode() method that all enums inherit from Object.
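
The legal items can be tried out with a small sketch like the one below. The class EnumDemo and the method describe are names invented for this illustration, and the enum is declared without "public" only so that everything fits in one file. The two illegal items are left as comments, because they do not compile.

```java
// Wallpaper comes from the question; EnumDemo and describe() are hypothetical names.
enum Wallpaper {
    BROWN, BLUE, YELLOW;
}

class EnumDemo {
    // Item 4: a method can receive an enum reference as a parameter.
    static String describe(Wallpaper wp) {
        // String concatenation calls wp.toString(), which returns the constant's name.
        return "wallpaper=" + wp;
    }

    public static void main(String[] args) {
        // Item 2: the usual way to obtain an enum reference.
        Wallpaper wp = Wallpaper.BLUE;
        System.out.println(describe(wp));

        // Item 5: hashCode() is available on every enum constant.
        int hcode = Wallpaper.BLUE.hashCode();
        System.out.println(hcode == wp.hashCode());

        // Items 1 and 3 are rejected by the compiler:
        // enum PatternedWallpaper extends Wallpaper { STRIPES, DOTS, PLAIN; }
        // Wallpaper bad = new Wallpaper(Wallpaper.BLUE);
    }
}
```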

Approximate string joins in a database - Part 1

Wednesday, January 10th, 2007

Strings are ubiquitous and ambiguous. When a communication channel is established between two people, noise and misunderstanding inevitably introduce errors when textual data is transferred.

In any enterprise system there are many places where this type of error can occur: client names, addresses, company names, and so on. Such errors can make it impracticable to find a record with an exact match against a query.

Think of the hard life of the mega action star Arnold Schwarzenegger: how many times does he need to repeat his name to the operator?

To solve this type of problem, a number of phonetic algorithms were developed. A phonetic algorithm uses rules to transform substrings into phonemes, trying to unify two strings that are written differently but spoken alike.

But all these algorithms suffer from the same problem: the rules are written for one specific target language, and the most common target language is English. This doesn't matter if your native language is English, or if you don't need to build internationalized applications…

Are there other solutions? Sure. Searching for alternatives, I found the approximate string matching algorithms, mainly the Levenshtein distance (or edit distance) algorithm. It is more robust and reliable, and it can be used without changes. It also has a small but very important advantage: we can use it to sort the result candidates by their distance to the searched string. This little advantage can be the difference between a poor result, with pages and pages of useless matches, and a well ordered list with the most relevant results first.
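
To make the idea concrete, here is a minimal sketch of the Levenshtein distance and of sorting candidates by it. It is the textbook dynamic-programming version, not tuned for a database, and the class name EditDistanceDemo, the method names and the sample names are assumptions made only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical demo class: textbook Levenshtein distance plus ranking by distance.
class EditDistanceDemo {

    // Minimum number of single-character insertions, deletions and
    // substitutions needed to turn a into b. Only two rows of the table are kept.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j characters of b costs j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i characters of a into "" costs i deletions
            for (int j = 1; j <= b.length(); j++) {
                int substitution = prev[j - 1] + (a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1);
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                curr[j] = Math.min(substitution, Math.min(insertion, deletion));
            }
            int[] swap = prev;
            prev = curr;
            curr = swap;
        }
        return prev[b.length()];
    }

    // Returns a copy of the candidates, closest to the query first.
    static List<String> rank(String query, List<String> candidates) {
        List<String> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingInt((String c) -> distance(query, c)));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Schwarzkopf", "Shwarzeneger", "Schwarzeneger");
        for (String name : rank("Schwarzenegger", names)) {
            System.out.println(name + " -> " + distance("Schwarzenegger", name));
        }
    }
}
```

Sorting by distance is what puts the nearly correct spellings at the top of the list, while unrelated names sink to the bottom.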

I hope to show soon how to use this great algorithm to build a full-text search engine on databases.

An SCJP question about arrays

Wednesday, January 10th, 2007

Question: Given the following:

1.public class CommandArgs {
2.   public static void main(String[] args) {
3.     String s1 = args[1];
4.     String s2 = args[2];
5.     String s3 = args[3];
6.     String s4 = args[4];
7.    System.out.println(" args[2] = " + s2);
8.  }
9.}

and the command-line invocation,

java CommandArgs 1 2 3 4

what is the result?

  1. args[2] = 2
  2. args[2] = 3
  3. args[2] = null
  4. args[2] = 1
  5. Compilation fails
  6. An exception is thrown at runtime

OK, this question is trying to trick you. Look closely. First of all, the first index of a Java array is zero, so the command line shown sets args[0] = 1, args[1] = 2, args[2] = 3 and args[3] = 4. The assignment:

6.     String s4 = args[4];

throws a RuntimeException, or more precisely, an ArrayIndexOutOfBoundsException, because index 4 does not exist in an array of four elements. The correct answer is 6.
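
The behaviour can be observed without the command line with a sketch like this one. The class CommandArgsDemo and the method run are hypothetical; they only wrap the statements from the question in a try/catch so that the exception can be reported instead of ending the program.

```java
// Hypothetical wrapper around the statements from the question.
class CommandArgsDemo {

    // Runs the same assignments as CommandArgs.main and reports what happened.
    static String run(String[] args) {
        try {
            String s1 = args[1];
            String s2 = args[2];
            String s3 = args[3];
            String s4 = args[4]; // index 4 does not exist when only four arguments are given
            return " args[2] = " + s2;
        } catch (ArrayIndexOutOfBoundsException e) {
            return "exception: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] ignored) {
        // Same arguments as "java CommandArgs 1 2 3 4".
        System.out.println(run(new String[] {"1", "2", "3", "4"}));
        // With a fifth argument every index exists and the println line is reached.
        System.out.println(run(new String[] {"1", "2", "3", "4", "5"}));
    }
}
```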

An SCJP question about right shifts and two's complement arithmetic

Wednesday, January 10th, 2007

Which of the following expressions result in a positive value in x? (Choose all that apply)

  1. int x = -1; x = x >>> 5;
  2. int x = -1; x = x >>> 32;
  3. byte x = -1; x = x >>> 5;
  4. int x = -1; x = x >> 5;

To solve this question, we need to know:

  • All numerical data types are stored as bit patterns.
  • How to represent negative numbers using two's complement.
  • An int is a signed 32-bit value; a byte is a signed 8-bit value.
  • Operands of type byte, short or char are promoted to int before an arithmetic or shift operation, so the result of the operation is at least an int.
  • >> is the signed right shift operator (the vacated left bits are filled with copies of the sign bit), >>> is the unsigned right shift operator (the vacated left bits are filled with zeros).
  • When the value to be shifted (left operand) is an int, only the five lowest-order bits of the right operand are used as the shift distance. The actual shift distance is the right operand masked with 31 (0x1f), i.e. it is always between 0 and 31 (if the right operand is >= 32, the shift distance is the right operand % 32).
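
The shift rules in this list can be checked with a short sketch. The class ShiftRulesDemo and the helper effectiveDistance are hypothetical names used only for this illustration; the helper just applies the 0x1f mask described in the last bullet.

```java
// Hypothetical demo of the shift rules for int operands.
class ShiftRulesDemo {

    // The shift distance actually used for an int left operand.
    static int effectiveDistance(int requested) {
        return requested & 0x1f;
    }

    static int signedShift(int value, int distance) {
        return value >> distance;  // vacated bits are copies of the sign bit
    }

    static int unsignedShift(int value, int distance) {
        return value >>> distance; // vacated bits are zeros
    }

    public static void main(String[] args) {
        int x = -1;
        System.out.println(signedShift(x, 5));     // still negative
        System.out.println(unsignedShift(x, 5));   // positive: the sign bit is now 0
        System.out.println(effectiveDistance(32)); // a distance of 32 behaves like 0
        System.out.println(unsignedShift(x, 32));  // so the value is unchanged
    }
}
```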

With all this known by heart, the question becomes easy.

In the first item, we need two's complement to know what is stored in the x variable. To represent a negative number, we first invert each bit of the positive number and then add 1. In this case, 1 = 00000000000000000000000000000001; inverting gives 11111111111111111111111111111110; adding 1 gives 11111111111111111111111111111111.
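
The invert-and-add-one rule can be written directly in Java. In this sketch the class TwosComplementDemo and its methods are invented for the illustration; bits only pads the output of Integer.toBinaryString to 32 digits so that the patterns can be compared by eye.

```java
// Hypothetical demo of two's complement negation.
class TwosComplementDemo {

    // Invert every bit, then add 1.
    static int negate(int n) {
        return ~n + 1;
    }

    // The 32-bit pattern of n, left-padded with zeros.
    static String bits(int n) {
        return String.format("%32s", Integer.toBinaryString(n)).replace(' ', '0');
    }

    public static void main(String[] args) {
        System.out.println(bits(1));         // the pattern of 1
        System.out.println(bits(~1));        // every bit inverted
        System.out.println(bits(negate(1))); // plus one: the pattern of -1
    }
}
```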

The >>> operator is an unsigned right shift. In other words, the shift affects the whole bit pattern, including the leftmost bit. Shifting 11111111111111111111111111111111 by 5 positions with >>> gives 00000111111111111111111111111111. I'm too lazy to do the arithmetic, but this number is positive for sure (see the leftmost bit?). The first alternative is part of the answer.

In the second alternative, since the right operand equals 32, the shift distance is 32 % 32 = 0, so the bit pattern stays unchanged and the value remains negative.

The third is easy. Remember: the byte operand is promoted to int, so the shift always produces an int. Without a cast, the third item therefore produces a very nasty compile-time error:

Type mismatch: cannot convert from int to byte
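
For comparison, here is a sketch of the third item with an explicit cast added so that it compiles. The cast and the names ByteShiftDemo and shiftWithCast are assumptions and not part of the question. Note that narrowing back to byte keeps only the low 8 bits, so the value is negative again.

```java
// Hypothetical variant of item 3 with an explicit narrowing cast.
class ByteShiftDemo {

    static byte shiftWithCast(byte x) {
        // x is promoted to int before the shift, so the int result
        // has to be narrowed back to byte explicitly.
        return (byte) (x >>> 5);
    }

    public static void main(String[] args) {
        byte x = -1;
        int promoted = x >>> 5;               // the int result of the shift
        System.out.println(promoted);         // positive int
        System.out.println(shiftWithCast(x)); // low 8 bits are all ones: negative
    }
}
```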

The fourth item uses the signed shift operator, so no matter how many positions are shifted, the result stays negative.

Conclusion: only the first item yields a positive value.