Thursday, December 25, 2008

Test Automation ROI

Recently I was working on an ROI (Return-on-Investment) model for test automation. It was quite interesting to note that there are not too many "ready-made Excels" out there for this purpose. The ones that exist come mainly from automation tool manufacturers. So I set about creating my own model, which led to some interesting observations.

Test Automation ROI Ingredients

Expenditure side

  1. NRE (Non-Recurring Effort / Expense) - buying the tool license, training people, setting up the environment and infrastructure
  2. Implementing the initial testing scenarios
  3. Maintaining the testing scenarios and the environment and infrastructure
  4. Adding new scenarios and new required features to the environment and infrastructure
  5. Usually, better-trained testing engineers are required to maintain the automated testing system
Revenue side

  1. In the long run, may be able to reduce manpower (less manual testing is needed)
  2. Reduce the number of escaped defects
  3. Increase development confidence, allowing more features per release, thus fewer releases per year
  4. Shorter test and development cycles and improved TTM (Time-to-Market)
  5. Enhances the abilities of the testing team ("soft" revenue)

Influencing Factors

When coming to real-life numbers, it appears that it is very tricky to achieve positive ROI. And even if there is positive ROI, the break-even point comes after at least one year of investment, and in most cases two years. It's a simple case of a big lump-sum investment for expected future revenues.

The ROI is strongly related to the number of bugs detected in the system in general and in the field specifically, the number of existing manual testing engineers, and the number of releases per year:

  • A system with a very small number of bugs, mostly found in the testing lab and not in the field, makes it harder for automation to have positive ROI, as the potential for reducing escaped defects is low
  • If the current number of manual testing engineers is small, the potential revenue from reducing manpower is low. However, if the number of escaped defects is high, automation can help lower it
  • A low number of releases per year also makes it harder to achieve positive ROI, as the lump-sum investment serves fewer cycles of use per year

The ROI is strongly related to the changes in the system:

  • The need for new scenarios and features in the environment during the year means more investment in keeping the testing system up-to-date
  • The lifetime of an automated scenario before it becomes broken or irrelevant because of changes in the system is very important

On an "average" system with some major new features each year, about 4 major releases per year and about 20 major escaped defects per year, the following can be observed:

  • In order to properly support and maintain a test automation project, you would probably need the same number of people as you had before with manual testing, all along. The new features in your product require new testing scenarios as well as new features in the test automation environment
  • The major revenues stem from reducing number of major releases (adding more features per release) and from reduced number of defects in field
  • It takes about 18 months to return the investment
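
To make the arithmetic behind such a model concrete, here is a minimal break-even sketch (all figures are hypothetical placeholders, not measurements from a real project):

```java
// Hypothetical break-even calculation for a test automation investment.
public class AutomationRoi {
    // Returns the number of months until the NRE is recovered,
    // or -1 if the monthly savings never exceed the maintenance cost.
    public static int monthsToBreakEven(double nreCost,
                                        double monthlyMaintenance,
                                        double monthlySavings) {
        double netMonthlyGain = monthlySavings - monthlyMaintenance;
        if (netMonthlyGain <= 0) {
            return -1; // automation never breaks even
        }
        return (int) Math.ceil(nreCost / netMonthlyGain);
    }

    public static void main(String[] args) {
        // Made-up numbers: NRE of 180k, 5k/month maintenance, 15k/month savings
        System.out.println(monthsToBreakEven(180_000, 5_000, 15_000));
    }
}
```

With these invented numbers the net gain is 10k a month against a 180k up-front cost, i.e. 18 months to break even, in line with the estimate above.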

Projects that are not yet stabilized and have too many changes would probably not benefit from end-to-end test automation, although automated regression based on unit testing can still assist.

Wednesday, December 17, 2008

Module Separation

Module separation is one of the oldest tricks in software engineering. As it is too complicated to attack a big problem, we split it and attack it in pieces. The pieces should eventually work together. We don't want to have two teams digging a tunnel from two sides of the mountain, missing the meeting point. In the tunnel case you get two tunnels which may have some value, with software you will just have a dysfunctional messy system.

The engineering task, some would say the art, of defining the modules and the interactions between them is crucial for the success of the system. Do we need our modules to be stateful? Do we want synchronous or asynchronous communication? What would be the message language between the modules: a programming API or a communication layer? On top of which protocol would the messages travel? And above all, which modules should we define? And for each module, which modules shall it be aware of, and of which shall it be totally agnostic and unaware?

Usually we would like the communication between modules, i.e. the public interfaces that a module exposes, to be as small as possible. Operations or sub-operations that require much communication should be kept in the same module, while operations or sub-operations that require less communication can be split. We would also like to have some hierarchy between modules, so that not every module has to know all the others; preventing cyclic dependencies is usually an architectural requirement.

So, what is the definition of a module?

Trying to create a definition of a module, or even better a prescription, is not easy. Recently I attended a session by Rick Kazman, co-author of "Software Architecture in Practice". Rick referred to this question, defining a module as "the set of tasks belonging to a single topic, managed and built by an independent team of people." It is interesting to note that Rick's definition focuses on organizational aspects more than on architectural aspects of "what is a module". Rick argues that this goes well with Conway's Law, stated by Melvin Conway in 1968, which argues that the required communication between two teams has a direct relation to the required communication between the modules they develop.


Create your own module

A module should have a clear definition of roles and responsibilities, both technical (what the module does) and organizational (who is responsible for defects found in this module). It should also have a clear and concise public interface. A module would usually be deployed as one unit, though in many cases several modules are deployed together as a unit. Different programming languages give modules different semantics, syntax and deployment characteristics, and some programming languages have no notion of modules in the language itself. And, no less important, a module shall be clearly defined as a set of tasks managed and built by an independent team of people.
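
As a small illustration of a "clear and concise public interface", here is a sketch in Java (all names are made up): the module exposes one interface and a factory, while the implementation stays hidden:

```java
public class ModuleSketch {
    // The module's entire public surface: one small interface.
    public interface BillingService {
        double charge(double amount);
    }

    // The implementation is private to the module; other modules
    // cannot depend on its internals.
    private static class BillingServiceImpl implements BillingService {
        @Override
        public double charge(double amount) {
            return amount * 1.17; // e.g. adding VAT; an internal detail
        }
    }

    // The only way other modules obtain the service.
    public static BillingService create() {
        return new BillingServiceImpl();
    }

    public static void main(String[] args) {
        BillingService billing = ModuleSketch.create();
        System.out.println(billing.charge(100));
    }
}
```

Clients see only `BillingService` and `create()`, so the implementing class can be rewritten, or the module reassigned to another team, without touching any caller.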

How to do a 2000 pieces puzzle in a team



Doing a huge puzzle in a big team: does increasing the size of the team help? Is there a number at which we will have a negative marginal contribution? ("Mythical Man Month"). Can you create "modules" here? This question about puzzle building is inspired by Chris Rupp's talk at JAOO 2008 (see video capture here).

It's a very nice exercise for a management course. I would think of the following design: (a) a dispatcher team that goes through the pieces and assigns them to the other teams; (b) teams by colors and shades, e.g. the Dark Blue team deals with the sea part while the Light Blue team deals with the sky; (c) a frame team, getting all the pieces that compose the frame. Each team is a module, so we have a dispatching module, a few color-building modules, by colors and shades, and a frame-building module.


In case of resource shortage we can settle without the dispatching team and let the building teams do the dispatching themselves. The number of people will determine the separation into colors for the color-building teams, but even if we had 1,000 people we would not define 10 shades of blue, as the interaction between the teams ("I got this blue piece but it seems unrelated to me") would be too exhausting.

Shall module definition be influenced by external factors?

We saw in the puzzle example that, to some extent, the number of available people influences the number of shade teams that we may define. The expertise and skill set of our resources, as well as their geographical distribution, also play a role here, but we will leave this for a later post on the relations between organizational structure and architecture.

Sunday, November 30, 2008

Use Explaining Variables!

Explaining variables are temporary variables that a method doesn't really need, but which make it more readable. I like them very much.

"Introduce Explaining Variable" is in fact a refactoring method in Martin Fowler's famous Refactoring catalogue -- p. 124 in the Refactoring book. (It goes a bit against Fowler's "Replace Temp with Query" -- but that one is controversial anyhow -- I myself like temp variables in any shape and form! As long as they have meaningful names, of course...)

Why are temp "Explaining Variables" great?

  • It's better than a comment: the name of the variable says what you did in this funny math operation or complex logical act. Comments are sometimes neglected; the logic changes while the old, no-longer-accurate comment stays. The chances for a variable called "isTheWorkerAvailableAndWithoutConstraints" to really mean what it says are much higher than for a mere comment: we can count on the programmer to change the variable's name if something changes. Or we shoot him if not. With a comment we would feel bad shooting a good programmer just for not updating a comment.

  • It goes anywhere. You don't need to try and recall "so where are we now" or "what did we put in this variable" - the long funny name says it all, and says it everywhere it goes!

  • The name can carry with it important info such as measurement units. I'm extremely fond of variables with names like "timeoutInMillisec" or "distanceInMeters" or "radiationTimeInSeconds". It can save lives. Or satellites.

  • When calling a method it's sometimes more readable to put the argument you want to send into a temp variable with a decent name, rather than just sending the argument. For example, a variable called "shouldSort" set to false may be much more readable than just sending "false". The programmer may also add, above the initialization to false, why false is the right value to send.
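
A small sketch of the points above (the names and the condition are invented for illustration):

```java
public class ExplainingVariables {
    // The explaining variable names the intent of an otherwise
    // opaque boolean expression.
    public static boolean canAssign(boolean hasShift,
                                    int constraintCount,
                                    boolean onLeave) {
        boolean isTheWorkerAvailableAndWithoutConstraints =
                hasShift && constraintCount == 0 && !onLeave;
        return isTheWorkerAvailableAndWithoutConstraints;
    }

    public static void main(String[] args) {
        // A named variable instead of passing a bare "false":
        boolean shouldSort = false; // results arrive pre-sorted from the DB
        System.out.println(canAssign(true, 0, false) + " " + shouldSort);
    }
}
```

The condition itself is trivial here; the point is that the reader never has to reverse-engineer what the boolean expression was checking.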


Usually during code review, when a small thing is not clear, I prefer an explaining variable over a comment.

Friday, November 14, 2008

Make sure to have a strict XSD!

I'm going to tell you three things here:

  1. Don't invent new languages!

  2. XMLs are also languages (if they describe flow control)

  3. If you do create a new XML based language, make sure that it has a strict schema!

-------



Do not invent new languages

We have enough of them. And by a "new language" I mean anything that describes a flow; that includes all kinds of weird XMLs. In some cases people call these XMLs "configuration" but when I look into it, it's NOT. Configuration is when you want to set a value for some attribute. If it is a complex attribute, e.g. a network topology, you may need to use XML, as you need a way to indicate hierarchy. But when it comes to defining a flow, and you have conditions, maybe even method calls and loops, this is a new language! Don't underestimate it.


What's BAD with creating your own new cute language?
(I'd even phrase it: "Why DSL sucks!")

Suppose that you have a service and you want to allow the user to externally configure its flow. You don't want to expose your code and allow the user to change it (that would indeed be a weird idea). And you don't want to load at run-time some external lib or dll or Java class, which the user can compile and add; again a reasonable decision. You want to keep it as simple as editing a text file. So you create your own little XML based language and it works great. The problem starts when problems start. And it's not a tautology. When something is not working correctly with your cute little language you find it very hard to debug, as you don't have a proper debugger, nor even a proper IDE, not to speak of reasonable intellisense auto-complete. Your cute language is orphaned; to properly support it you will find yourself investing huge efforts, far beyond what you originally planned.


What's the alternative

The best way, if you wish to have external flow configuration that includes conditions and such, is to use some existing scripting language and call it from your program. Perl, Python, Jython, of course Groovy, and others can be good. The advantage is that you get a mature language with all the required surroundings. You can run the script from within your program and get back feedback on variable values or a return code. And it is much easier to explain the language to your user: you just point him to the language tutorial.


But if you do need your own new XML based language

Make sure that the language has a very strict XSD:

  • Use enums for lists of values, so the user can select only what's legal

  • Use types: never accept a string where an int is expected!

  • Use regexps to define the allowed patterns of string values

  • Don't use the same attribute for two purposes, it breaks the schema: if you can get either an int (number) or a string (variable), use two different attributes, each with its own rule

  • Use different XML elements to enforce sets of attributes: suppose that A-B and C-D are attribute couples (i.e. if I use attribute A I must also provide B, and if I use C I must provide D); it's better to break them into two different elements, even if they are similar in nature
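
A tiny XSD fragment sketching these rules (the element, type and attribute names here are made up for illustration):

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- enum: the user can only pick a legal value -->
  <xs:simpleType name="LogLevel">
    <xs:restriction base="xs:string">
      <xs:enumeration value="debug"/>
      <xs:enumeration value="info"/>
      <xs:enumeration value="error"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- regexp: constrain the allowed pattern of a string value -->
  <xs:simpleType name="HostName">
    <xs:restriction base="xs:string">
      <xs:pattern value="[a-zA-Z][a-zA-Z0-9\-\.]*"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- separate attributes for an int literal vs. a variable reference -->
  <xs:element name="wait">
    <xs:complexType>
      <xs:attribute name="timeoutInMillisec" type="xs:int"/>
      <xs:attribute name="timeoutVariable" type="xs:string"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

With types, enumerations and patterns in place, a schema-aware XML editor can flag illegal values as the user types, which is exactly the "language protection" argued for below.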

A strict schema gives you good language protection, thus fewer bugs, decent intellisense auto-complete if you use a decent XML editor, and clear documentation for the user of what's right and what's wrong.

Just as any respectable language has its BNF, if you invent a new XML based language you should base it on a strict schema.


Wednesday, November 12, 2008

It's done already!

Shlomi Hazan, a student of mine, had been dragging out his workshop for a second year. It was his last and final course. Finally, thank God!, he submitted his work.

Here he is, with the 2 laptops he brought with him, one desktop PC, one AudioCodes GW, a SIP phone and some POTS phones. He is in the computer lab, so though he did bring a lot of equipment, some of the equipment you see is part of the lab...



Now, I'm enthusiastic not only because Shlomi finally decided to finish his degree and submit his workshop, but because it was a very nice workshop without too much code.

Shlomi presented a SIP Conference Call server. The thing is that using Asterisk you can get almost all of it done (and if for some reason you need a JAIN SIP stack, you will find an open source one as well).

I think that Asterisk is an amazing open source project.

Thursday, October 23, 2008

NotifyingBlockingThreadPoolExecutor



I've found this interesting article on java.net, extending Java's thread pool.

Sunday, October 12, 2008

Don't Make Me Think!

I'm reading the great Don't Make Me Think book ("a common sense approach to Web Usability") by Steve Krug.




And I recall that I have the exact same message when reviewing code: Don't Make Me Think. If I have to think for more than a second about a line of code it means that it's too complicated; you need to break it in two, or maybe add a good comment above it.

If I start asking questions like "why did you do this or that?" it's not good. It means that your code is not self explanatory, or that you had a good reason for doing something in a special, maybe unorthodox way, or just not the way I'd think of doing it, but you didn't explain it well enough in a comment.

When reviewing code I usually ask the developer, "OK, open the IDE, now let me try to understand on my own", "don't explain anything". "Shut up, stop explaining what you did, don't you understand I want some quiet?!"

Then if I understand everything on my own, without needing to use my brain too much, it's a good sign. The programmer gets his candy and we say goodbye. Otherwise the big refactoring, commenting and rearranging work starts. While mumbling some oriental songs.

Friday, October 10, 2008

Selecting the Right Technology

or - the Art of "Satisficing"

These modern times are unmerciful. Back in the old days, when you had a programming task you could select between punching the pink punch-cards or the yellow punch-cards. Later, when Java came up, if you selected Java, the library selection was quite easy: you went with what Java provided. This was in fact one of the greatest things about Java, its built-in set of common libraries, so there is one standard for IO work or database connectivity. Any new programmer that comes along either knows the standard already or should learn it. But no smartass would come along knowing some all-different thing, asking dumb questions like “why are you using this and not that?”. Luckily for us, there was only one option for everything.

But we are now in new times. For every fart you want to make, there are several technologies you may use (pardon the word “technology”). And selecting the right one is in some cases more troublesome than just doing the damn thing.

Selecting the right technology: well, here is the “full process”:

  1. Divide the thing you are going to build into relevant “topics”, each of which may need a whole different set of technologies – you are anyway going to divide your project into modules or tasks. “Topics” may include: logging, configuration, communication channel, XML parsing and manipulation, database connectivity, caching, pooling, redundancy and failover support, build infrastructure, continuous integration support, unit test framework and code coverage support, static code analysis tool.

  2. Build the list of ALL relevant technologies, for each relevant topic – don’t miss any relevant technology as it might be the one for you!

  3. Learn all technologies: write some code with each of them or at least read some example of code to see how much you like it; read analysis of pros and cons of each technology or do the analysis yourself; read some comparative benchmarks or better run them by yourself.

  4. Create a nicely built comparison table for each topic, with all relevant comparison categories and aspects; create a team to discuss the comparison categories and give them weights, then give a (subjective) grade for each technology in each category and calculate the final score. If money is involved, add it to the equation somehow (not so simple, but with small amounts it is usually just added as another row to the comparison table with some grade from 0 to 10 or alike).

  5. You have a winner, start working with it.



  6. From time to time (once in a few months / once a year) conduct the process again. You cannot, of course, just rest on your laurels. New technologies keep coming, so you need to go again and reassess your conclusions. Of course, you may need to give some weight to “knowing this technology already” or “paying already for this license” or “having done the work already”, but it doesn’t mean that you would not choose to replace it after all with something better, maybe something with much better performance, a nicer user interface or easier maintenance.
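
Step 4 above boils down to a weighted average; here is a toy sketch with invented weights and grades (real selections would of course use your own categories):

```java
public class TechScore {
    // Weighted average of per-category grades; weights and grades
    // below are illustrative, not real benchmark data.
    public static double score(double[] weights, double[] grades) {
        double total = 0, weightSum = 0;
        for (int i = 0; i < weights.length; i++) {
            total += weights[i] * grades[i];
            weightSum += weights[i];
        }
        return total / weightSum; // normalized weighted score
    }

    public static void main(String[] args) {
        double[] weights = {3, 2, 1};   // e.g. performance, docs, license cost
        double[] gradesA = {8, 6, 9};   // hypothetical technology A
        double[] gradesB = {6, 9, 7};   // hypothetical technology B
        System.out.println(score(weights, gradesA) + " vs " + score(weights, gradesB));
    }
}
```

The mechanics are trivial; the expensive part of the "full process" is filling in honest grades, which is exactly the point made next.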


This is the full process. The problem with the full process is that it is VERY COSTLY. To conduct it thoroughly you may invest in the selection much more than the gain between the optimal choice and a random one (or, better, a “smart random” one).

The “full process” is probably the way for decisions with several millions at stake. And you would probably never get to #6 above; you would rest on your laurels after a decision is made, even if a new promising technology is just out. They always appear and it’s a never ending race, like waiting for the best electric equipment to get a bit cheaper: once it’s there you would wait for the better new model to hit the same price. This is the road to never deciding and never taking action, or taking a decision but always changing your mind (a road that eventually leads to a dead end housing a lunatic asylum).

It’s simple economics: when you search for the cheapest Plasma TV set, you MUST take into the equation the time spent and the fuel burnt. You may have saved $100 by investing 5 hours on the internet and 2 hours driving to the warehouse, only to realize that the thieves over there suddenly have a different price (“no, the price we gave you over the phone was for the TX model with no screen, you need the TWX which comes with a screen and is $400 more, sorry!”). Eventually you get the TWX model that you want at what may seem like the real minimum price. But was it optimal?

“Satisficing” is the keyword here. The word was coined by the economist and computer scientist Herbert Simon, the only person to have won both the Nobel Prize and the Turing Award. It refers to the act of satisfying yourself by selecting a sufficient choice -- not necessarily the optimal one.


How to Satisfice?

  • Make sure that you do not miss the major relevant technologies, those that everybody talks about. For this you MUST make yourself well familiar with the subject.

  • There is a small chance that some un-spotted niche technology is your optimal choice, but since not many have selected this niche technology, chances are it has some flaws. Missing a decent audience is a flaw in itself, which may affect later support, attention and the availability of practiced manpower for this technology. So it's not a mistake to narrow the list to the major technologies that most people talk about and that have the biggest communities, as long as you identify the domain correctly.

  • A big community using the technology usually means that this technology is flexible and versatile enough to accommodate all, which is an important sign. Your needs would probably also grow and differ along time, you cannot fully foresee it now. Using a flexible enough technology is the cure for unexpected future needs.

  • In many cases it's not a clear cut, nor is it a Black-n-White. You checked all aspects, consulted with colleagues and still hesitate -- flip a coin.

  • Standard protocols and APIs come to your rescue. If you have made the wrong choice, a real crappy implementation, but it is based on a standard protocol or API, you can just replace it with another implementation. Thus, do not be tempted by proprietary extensions; stick to the standard. If there is an absolute need for using some proprietary extension (a big efficiency boost or the only way to achieve something), narrow it to the minimum and isolate it in the code to a central place (e.g. one utility class which is the only one using the proprietary extensions). This way, if there is a need to change the implementation it would be easier to just re-implement the utility class.

  • Learning curve should be a factor. But you should remember it's a one time cost, compared to using the technology and maintaining the code, over several projects. Usually after you learn it, the complicated stuff becomes much less complicated not only for you but for the entire team, you find the way to wrap the complexities and make it easy to use. Give credit to the community, if it is said that something is good, wait before disqualifying it for being too complex.

  • Stick to things that work. If you see an example of two technologies working together (e.g. A and B) and a second example of another couple (C and D) but no other combination (no A and C or B and D), maybe there is a reason for that. Be cautious before rushing to un-explored combinations.

  • If you do want to be on the edge and rely on un-explored territories (something without a clear and exact example) -- build your own simple example before taking the technology to the real battlefield of your full-blown project. And be nice, publish your example!

  • Consistency and uniformity are valuable! Try to select the same technologies for all your projects, so you can share code, knowledge and bug fixes. Think twice before deviating to a new technology when you have investments in the old substitute. Then think once more. Usually upgrading to a new version of a technology, or moving to a whole different one, is something to be done once in a few years. If it happens to you twice a year, check whether one of your programmers is evaluating all available technologies for some academic assignment he is working on.

  • Be reluctant to adopt newly presented technologies in the market, even if everybody says it's the great new thing. Being on the cutting-edge may sometimes bleed! Technologies tend to mature, let others use the alpha and the beta. Wait for the 1.1 version. If you do want to take the new beta technology, make sure to test it well on a demo project which has the relevant features you need in your full-blown project.

  • A documented technology is always better than an undocumented one. Prefer to ignore the less documented, even if it's cool.

  • Performance is important but it is dynamic. Changes in hardware and computer architecture (e.g. moving to multi-core CPUs) may make a dominant choice decline. Updated versions of the implementations may also change the balance. Thus, unless there is a real significant difference, of a multiple factor, performance should not play a distinctive role. Standard protocols and APIs are more crucial.

  • Don't feel obliged to wait for a new version or a new technology that is about to come out, just because you want to explore it. It's like waiting for a new electric appliance model; there will always be a new one coming shortly, for which you will feel obliged to wait again. Make your selections and move on.

  • In many cases it's not a clear cut, nor is it a Black-n-White.
    You may feel here a deja-vu. Yes, I've said it already. But you must bear in mind that in many cases indeed, there is no right answer. This is another good reason to be conservative, keep with old technologies for which you have the knowledge and expertise, wait before adopting the cutting-edge, prefer documentation over nice fancy woos, prefer main stream over niche, prefer standards over proprietary.

  • Trust your gut feelings. But only after you have done all the required investigation and exploration. Write some code, check the alternatives. But if after investing the effort you don't feel good with it - don't use it.
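
To illustrate the bullet about isolating proprietary extensions: a minimal sketch in which all vendor-specific calls are confined to one adapter class. The vendor API here is imaginary; a plain HashMap stands in for it.

```java
import java.util.HashMap;
import java.util.Map;

public class VendorIsolation {
    // The rest of the codebase depends only on this standard-looking
    // abstraction, never on the vendor API directly.
    public interface FastCache {
        void put(String key, String value);
        String get(String key);
    }

    // The single class allowed to touch the (hypothetical) proprietary
    // API. Swapping vendors means re-implementing only this class.
    private static class AcmeCacheAdapter implements FastCache {
        private final Map<String, String> impl = new HashMap<>(); // stand-in for the vendor object
        public void put(String key, String value) { impl.put(key, value); }
        public String get(String key) { return impl.get(key); }
    }

    public static FastCache create() {
        return new AcmeCacheAdapter();
    }

    public static void main(String[] args) {
        FastCache cache = VendorIsolation.create();
        cache.put("k", "v");
        System.out.println(cache.get("k"));
    }
}
```

If the proprietary cache later proves a bad bet, only `AcmeCacheAdapter` changes; every caller of `FastCache` is untouched.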

Thursday, September 25, 2008

RM-788, Knowledge Sharing and Wikis


I have a new "all in one" remote control. It comes with a small manual with instructions and operation codes. These things work this way: you need to set the remote up for your appliances, so there is a specific code for each brand and model, and you have to punch the right code into the remote control; if the code was accurate, you can start using the remote. In order to adjust it to the specific brand and model of your appliances you have to look for the appropriate codes, listed in the manual. So far so good. But it turned out that my TV set, a PILOT of some unknown model, doesn't go with any of the codes listed in the manual for Pilot. One code did work, but only partially. Annoying.

What would you do at this point? One option was to return the remote to the store and have it shipped back to Taiwan or China or wherever it's manufactured.

But then I remembered the great invention of the Internet. Maybe it will come to my rescue. So I Yahooed and Googled for the name of the remote control, RM-788 of Universal Electronics. And I got to some reseller sites, and even to the site of Universal Electronics themselves. But apart from seeing the picture of my remote there was no useful info available.

So I came back to the remote and to the manual. It appears that the thing has an auto-search function, which is a bit troublesome, being not really automatic, more like “semi-automatic”: the remote passes through all codes, but you have to check each code manually and, if it's not OK, press the “Power” button to move to the next possible code. Since there are about 400 possible codes for TV sets I was a bit pessimistic. BUT, it did work! After a while I reached code 138, which worked perfectly for my Pilot TV (there is a way to know the currently set code, by counting the number of LED blinks for each digit, so the moment I got it working I immediately wrote down the magic number 138).

Now comes my social responsibility. I wanted to share this knowledge somewhere, for the benefit of other Pilot-TV owners who may try to use this remote (even if there is only one in the entire universe!). Needless to say, the manufacturer's site doesn't allow adding comments or tips. Probably they are afraid of comments with obscene language, and don't have the resources for an editor. (Do you know of any company that does provide such a “drop a tip on my site” service?)

There was one site on electronic appliances which did have the ability to add tips and even had the RM-788 remote listed in its catalogue! Unfortunately, when trying to submit a tip the site asked me to register and fill in a ton of details (stuff like my mother's maiden name, oh madden!), so I left this unwelcoming site in a hurry and rushed to my blog.

Which made me think -- We need a Wiki here.

Just to clarify the terms. Speaking about Wiki and the Wiki idea, I refer to the concept of free editing with immediate publication. For some nuances a few operations may require prior registration, but not for adding or editing content. When editing is allowed only for the site owner, or is not immediately reflected in the site, then it's not Wiki for me, it's just a Content Management System, based maybe on Wiki framework and syntax, but not on the Wiki idea.

The Wiki idea was invented as early as 1994 (see: http://en.wikipedia.org/wiki/WikiWikiWeb). It came into wide recognition with Wikipedia, followed by many other Wiki sites, some as part of the WikiMedia foundation; others include the Star Trek Wiki, the World of Warcraft Wiki and many more (some lists can be found here, and here, and here, and here, and of course here and here).

Why is Wiki the best tool for knowledge sharing?
Because it allows someone with good will but no patience for bullshit to just get in and share his knowledge. Publishing the submission right away forces the submitter to pay more attention to it. Usually one would invest more time, knowing that no other person is responsible for editing the post and fixing any spelling mistakes. Of course, someone may need to follow new or edited articles and validate them. But we don't have to wait for the validation in order to see how the new contribution looks, publicly. This is the Wiki idea!

The fear of vandalism, by the way, is almost irrelevant when it comes to internal knowledge sharing inside organizations. This is the reason Wiki is becoming the method of choice for internal knowledge sharing inside big corporations.

As for the RM-788 Pilot-TV code, it appears that there is a Wiki for product information, simply called ProductWiki. It's commercial (not a non-profit). I would have dropped my tip there, but it didn't have the said RM-788 listed in its catalogue, and required me to register in order to create it, so I quit.

For more on Wiki as a tool for knowledge sharing, and other good references for managing information in the Wiki way:

Tuesday, September 2, 2008

Google open sourcing their new browser, Chrome

Google new browser, Chrome, is the talk of the day. And the comics are indeed great.

Some were enthusiastic about Google opening this vast project as open source. Noble indeed. Google themselves are feeling virtuous about it, or at least want us to feel that way. After all, it's a great tool (no one has really seen it yet, but they do know how to create a buzz and I guess it will indeed be a good browser); by releasing this great, superb, new-era browser as open source, anyone can take it and adapt it to his needs. Even Microsoft can, and maybe even will.

So why did they do it? Why open source it? You can give it away for free without opening it...

Slide 37 lists the noble arguments.




But could they really?
Do they have the option to ship a proprietary browser?

NO, because they don't have one.

This new Chrome browser is said to be built upon the WebKit and Mozilla projects. Both projects are open source and require derived work to be open as well (section 3.2 of the
Mozilla Public License and section 2 of the LGPL).

They could have found a way to keep some of the code closed, maybe the V8 JavaScript VM. But it would probably be too risky legally. And they might need their lawyers ready for Android issues (or for buying Sun).

Bottom line:

No need to be too cocky about open sourcing. As nice as it is that you opened it, you probably had no other choice; other open source initiatives worked hard on writing parts of your code.

Monday, September 1, 2008

CPU and GPU, from nvision 2008

Nvidia had a geek party in San Jose, called nvision2008.
In one of the peak events of the conference, Adam Savage and Jamie Hyneman came to demonstrate the difference between a Central Processing Unit (CPU) and a Graphics Processing Unit (GPU), using color cannons.
(See:
http://www.nvidia.com/content/nvision2008/day3.html).


Cory Ondrejka, former CTO at Linden Lab and one of Second Life's founders, posted the following videos from the party, showing the color cannons in action:
collapsing geography: geek celebrity. Makes you understand the power of GPUs. An inspiring color cannon! Wow!

Friday, August 29, 2008

Adding abilities to Java's ThreadPoolExecutor

I had a discussion on Java's ThreadPoolExecutor, which seems to lack the ability to block producers from adding tasks when the thread pool's queue is full. I went through some possible solutions and I'm gonna write something about it, so please stay tuned (and if you have something to tip on this issue, say it now).

-----------------
Happy to update:
I've published it on java.net; it is called Creating a NotifyingBlockingThreadPoolExecutor.
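For flavor, here is one common workaround sketch (my own illustration, not the code from the java.net article): a RejectedExecutionHandler that blocks the producer by re-inserting the rejected task into the queue with put().

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

public class BlockingSubmitExample {

    // Build a pool whose execute() blocks the caller when the queue is full,
    // instead of throwing RejectedExecutionException.
    static ThreadPoolExecutor newBlockingPool(int threads, int queueSize) {
        return new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(queueSize),
                new RejectedExecutionHandler() {
                    public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
                        try {
                            e.getQueue().put(r); // blocks until a slot frees up
                        } catch (InterruptedException ie) {
                            Thread.currentThread().interrupt();
                            throw new RejectedExecutionException(ie);
                        }
                    }
                });
    }

    // Submit n quick tasks through a tiny pool; producers block as needed.
    static int runTasks(int n) throws InterruptedException {
        final AtomicInteger done = new AtomicInteger();
        ThreadPoolExecutor pool = newBlockingPool(2, 2);
        for (int i = 0; i < n; i++) {
            pool.execute(new Runnable() {
                public void run() { done.incrementAndGet(); }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return done.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("completed: " + runTasks(10)); // prints completed: 10
    }
}
```

A known caveat of this sketch: if the pool is shut down while a producer is blocked inside put(), the re-queued task may never run, which is part of why a more careful solution is worth writing up.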

Thursday, August 28, 2008

Google, on the way down

I host my blog at blogspot, as you can see. Today I was editing the draft of the post "ChangeLightBulbWindowHandleEx" when suddenly, between one automatic save and the next, the draft just disappeared. Vanished. Gone. I was left with the title alone, no content.

There is no version saving in blogspot, so I couldn't go back to a previous version. Then I thought maybe I could turn to blogspot support and ask them to recover my draft; they might have it there somewhere. A long shot, but worth trying.

I started to look for the e-mail of the blogspot support.
And, after a while I found the link saying
Contacting Support. Great! So I followed it, to a nice page. And another nice page. And another. None of which offered an e-mail address or a submission form. All tried to offer advice for all sorts of problems. There was a link leading to Blogger Help, but there again, no e-mail or decent bug report form. BLAH!

Then I started looking for someone who had already bumped into the same problem, hoping there was an answer somewhere. And I got to this
mailing thread at the Google Blogger Help Group.



It could have been funny if it wasn't sad.

Don't you care about your bugs?

I know it doesn't really say anything about the business model of google. But it does say that they lost it somewhere down the road. Don't you have enough resources to care about your quality?


If I had google stock I would sell now.



ChangeLightBulbWindowHandleEx
for Blind Catalan-speaking Spaniard

A response to mistralzonline on "Why is that?"

My friend Alex Romanov, in a recent post at his blog, asked why Nokia is not supporting a synchronization tool for his Mac, while he could easily find such a free tool on the web.

The answer to this question is found in the well-known article by Eric Lippert, "How many Microsoft employees does it take to change a lightbulb?", which does the math on how many resources are needed for a formal release of a simple thing. And since it takes a lot, you must not get off course. Which can easily happen.

To summarize, the answer is: Focus.
Choose your lightbulbs!

I was once responsible for a State Machine module for a communication component. With the State Machine module we also provided a nice editor tool, based on Microsoft Visio. The users liked the Visio editor, which made life easier for them. We even thought of adding debug options to this editor, to allow the users to run in debug mode within our Visio tool.

At some point I decided to stop the development and support of this tool and instructed all users to start writing their state machines in the raw XML that the module itself received as input. It happened when I realized that we were spending too much time on this nice utility. Every new feature in the State Machine module required a parallel development for the Visio tool. Too much support effort was invested in negotiating features with the users and bugs with the testing team, training and support, keeping up with the Visio versions and more. I realized that we did not have enough resources to support all of this (the resource was in fact one person, supporting by himself both the State Machine module and the Visio tool... it was, by the way, the man behind mistralzonline from above, so he should know!).

Focus is the word. When we realized that we could not do it all at the required level of quality, we had to cut. And it's better to cut the utility Visio tool than the module itself.


When Nokia decides not to provide a tool for synchronizing one of their models with the Mac, it's because they believe that with the resources they would put into it, they cannot decently do a good job. It's not only development, it's the entire life cycle, including support for users who get stuck (and doing it better than google; a post on the horrible google support will follow soon).

Now, Nokia has the option of managing a community, or at least pointing to the available solutions on the web. After all, there was such a free tool out there. It is interesting to note that not many commercial companies officially manage such a development community or point at available supporting tools related to their product. Sun does have java.net, but it's more like an open source arena.

Why is that?

Because for Nokia, pointing at a tool, even if accompanied by the common disclaimer waiving any liability or guarantee, still means taking responsibility. If the tool infringes privacy, causes loss of data or simply doesn't work right, the furious public will come to Nokia. To take such responsibility Nokia would have to thoroughly test the tool, which brings us back to square one. This is why they choose to refrain and let you pick it up by yourself, which, as it appears, you can do quite well.

Tuesday, August 26, 2008

Rolling the Dice - the Cube Project for Silverlight



I am teaching a "Product Workshop" course at the MTA college in Tel-Aviv.
Usually students submit something that can be used as a "black box" product, like: a web application managing event invitations, a simulator for physics rules and many other nice things.
Whenever a student wants to create a library or a toolkit (that is, something to be used by other applications), I urge him or her (well, it's a him; let's stop with this political correctness, when it is a she the product is a very nice, well designed, all-screens-pink web application for match-making...*), well, I urge him to put it out as open source.

So here is something recently created by my student, Itamar Kotler:
http://www.codeproject.com/KB/silverlight/CubeProject.aspx

It's a component for creating a rotating cube out of 6 photos, right into your web site.

Funny that this is not inherently supported by Silverlight (there is something in Silverlight, but the argument is that it doesn't really support full rotation through all possible angles). I guess it will soon have a decent substitute as part of Silverlight, which will probably disappoint Itamar. But then again, he can always argue that he drove Microsoft to implement and present a decent rotating cube.




* Just before I'd be accused of being a male chauvinist swine, some of the best projects I get are from women. And I can live with pink screens.

Friday, August 1, 2008

Do you have an API?



This is so basic, yet still ignored so often.

When you buy a SW tool, or think of developing one, one of the first questions to ask is: is there some kind of API exposed? A CLI (Command Line Interface)? A Web Services interface?

A product without a clear, good API means that you cannot customize it to your own needs; it's "take it or leave it". Of course, you can ask for features and then wait in line.

A good API must allow preserving state when needed. Suppose I want to perform a series of operations, for example a chain of queries: I need to be able to send the 2nd query to filter the result set without reminding the tool what the first query was. And the results should probably be cached for a specific amount of time. So I'd probably need some session ticket, request id or alike.
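To illustrate the session-ticket idea, here is a hypothetical sketch (the class and method names are mine, not from any real tool): the first query caches its result set under a ticket, and the second query refines it without resending the first.

```java
import java.util.*;

public class SessionQueryApi {
    private final Map<String, List<String>> sessions = new HashMap<String, List<String>>();
    private final List<String> data;
    private int nextId = 0;

    public SessionQueryApi(List<String> data) { this.data = data; }

    // First query: filter the data set, cache the result, return a ticket.
    public String query(String contains) {
        List<String> result = filter(data, contains);
        String ticket = "s" + (nextId++);
        sessions.put(ticket, result);
        return ticket;
    }

    // Chained query: refine the cached result set; no need to repeat the first query.
    public List<String> refine(String ticket, String contains) {
        List<String> result = filter(sessions.get(ticket), contains);
        sessions.put(ticket, result);
        return result;
    }

    private static List<String> filter(List<String> in, String contains) {
        List<String> out = new ArrayList<String>();
        for (String s : in) if (s.contains(contains)) out.add(s);
        return out;
    }

    public static void main(String[] args) {
        SessionQueryApi api = new SessionQueryApi(
                Arrays.asList("red apple", "green apple", "red car"));
        String ticket = api.query("red");                // caches [red apple, red car]
        System.out.println(api.refine(ticket, "apple")); // prints [red apple]
    }
}
```

A real tool would also expire the cached results after a timeout, as noted above.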

With a good API I can use the tool for things that were not originally planned, and still do well. And when the question arises of whether the tool knows how to do this and that, the answer would much more easily be: yes, it has the ability to programmatically allow it on the client side.

Now, if the product doesn't have a good external API, it probably means that it is not well designed, with a clear separation along a Model View Controller (MVC) architecture.

Bottom line

Even the most graphical product MUST have a model and an external API.
Always start by thinking of the API or CLI version and ignore the presentation layer.
And think twice before taking a product that can be operated only via its specific UI environment.


Think - MVC, MVC, MVC, MVC, MVC, MVC, MVC, MVC, MVC ...

Monday, July 21, 2008

To AJAX or NOT to AJAX?



I had a discussion today on when to use AJAX and when not to.

The idea is simple: the plain old request-response is usually a good thing, if you keep your pages small, which you should (CSS and JavaScript should be external, thus a nicely built pure HTML page should be around 10-20K, not more).

The plain old request-response keeps your URL meaningful - the state of the page is reflected in the URL, which is a very good practice to follow.

The plain old request-response keeps your Back button functioning as it should, without any unnecessary tricks.

The plain old request-response keeps all your content available for search engines, which is usually what you want.

The plain old request-response keeps your user using what he is used to. No surprises.

So - when should AJAX be used?

  • Auto-completion
  • For fetching partial long lists based on some initial input
  • Quick response to something insignificant, like showing the current results of a poll right after your vote, while blocking you from re-voting (not that you can't delete your cookie or session identifier and go back to the page to vote again; usually the vote is not registered per IP address)
Some additional insights on this topic can be found here:
http://webdesign.about.com/od/ajax/a/aa092506.htm

Sunday, July 20, 2008

Building features organically


I ran into this interesting post and wanted to share it with you:
http://marcelo.sampa.com/marcelo-calbucci/brave-tech-world/Building-features-organically.htm

This is how it begins:

One of the most interesting aspects of Agile development is to not build things
that you won't use now or over-engineer a component because "we might need this
in the future". That is a wonderful thing for developers and it does cut a lot
of complexity and time to deliver the feature, by consequence making it less
buggy.

Now, the rest is quite the same, so you can either go there or not.
But the first comment (and the only one when I visited) is pretty smart. And counting on you lazy fellows not to invest in following the link and scrolling down, here's what Mike comments:

I spec from the "top down" and build from the "bottom up". That way, I think I
have a more thought-out overall plan, and it eliminates implementing some
possible dead-end features. Specs are much more malleable and have far fewer
requirements from a quality point of view. So I find it worthwhile to be a bit
more expansive and "future thinking" in a spec. But when it comes down to
implementation, I try to do as you say - and build from the bottom up with the
minimum set of functionality that can be made to work consistently. You can then
decide when you've "fiddled" enough, and decide to ship the features to your
customers.

Well said.

Thursday, July 3, 2008

Thinking Procedural
 

From Object Oriented and Event Driven Back to Procedural Programming




Object Oriented Programming

Object Oriented Programming fits most of the applications we create today. The idea of methods or services, encapsulated into classes or components, with clear separation of concerns, isolated from each other, is clear and obvious. This is the heart of the Object Oriented concept, of Component Based Development, of SOA, etc.


Event Driven Programming

Event Driven is another concept, arising mainly from User Interface applications; but protocol stacks, state machines, message brokers and all kinds of listeners in general can also be seen as based on event driven concepts. Event Driven does not necessarily relate to Object Oriented architecture. But still, it is not simple procedural programming.


Other Non-Procedural

Other types of programming methods which are non-procedural in nature include Pattern Matching Programming (e.g. XSLT) and Declarative Programming (e.g. HTML, SQL; some view Object Oriented Programming as declarative, I disagree).


Thinking Procedural

All of the above - Object Oriented, Event Driven, Pattern Matching, Declarative Programming - keep us apart from the flow. We get used to ignoring the flow, the entire end-to-end scenarios which eventually create the system.

It has been a long time since we created a pure procedural program. This made us stop thinking in a procedural way. Which is problematic, as eventually the program will run as a sequence and we have to make sure it will work properly.

Here come to the rescue: Sequence Diagrams, Scenarios, XP User Stories, UI Storyboards and Test Driven Development. They were all invented to assist us in solving this problem, by forcing us back to "thinking procedural".


Monday, June 30, 2008

Bim Bam BOM

We had this strange bug recently: while trying to parse a perfectly healthy XML file we got an exception saying:

org.xml.sax.SAXParseException: Document root element is missing.

We could, however, open the XML file in the browser without an error. And our XML editor also took it without a problem. Yet, in our code, trying to operate some XSL transformation using Xalan, we got the exception above. And the problem was not with the transformation itself, at least according to the exception. The problem was with the XML document starting without a root. Though the first character seen in the file was an opening angle bracket...

Well, there are things beyond what you see.

To find out the exact stream of bytes that the transformer receives, and doesn't like, we added the simplest debug line, dumping the bytes from the file to the screen, not as chars but as their values. There appeared to be two bytes before the opening angle bracket of the XML: FF FE.

At this point, my friend and colleague Effie Nadiv (famous for his Hebrew site, and a UI authority and legend), shouted out: it's the BOM! Xalan doesn't recognize the BOM correctly!!

And without further ado he presented the file (same file) in text mode and in binary mode:





See the FF FE at the beginning? This is the BOM.

BOM stands for Byte-Order-Mark, added to UTF-16 documents to denote the order of the bytes in each pair of consecutive bytes creating a character. UTF-8 documents may also have a BOM, but there it is redundant and has no meaning.
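A small Java sketch of such a debug check (my own illustration, not the debug code we actually used; the byte values are the standard BOM signatures, UTF-32 left out):

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class BomSniffer {
    // Returns the encoding a leading BOM signals, or null if no BOM is found.
    static String detectBom(byte[] head) {
        if (head.length >= 3 && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB && (head[2] & 0xFF) == 0xBF)
            return "UTF-8";
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFF && (head[1] & 0xFF) == 0xFE)
            return "UTF-16LE"; // the FF FE case from the post
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFE && (head[1] & 0xFF) == 0xFF)
            return "UTF-16BE";
        return null;
    }

    public static void main(String[] args) throws IOException {
        // Dump and classify the first raw bytes of the file given as argument
        byte[] head = new byte[3];
        InputStream in = new FileInputStream(args[0]);
        try {
            in.read(head);
        } finally {
            in.close();
        }
        System.out.println("BOM: " + detectBom(head));
    }
}
```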

To read more about Unicode, UTF-8, UTF-16 and BOM, you may want to go to:
http://unicode.org/faq/utf_bom.html
http://en.wikipedia.org/wiki/Byte_Order_Mark

But, before you rush to the above, here is some GREAT reading material to understand, once and for all, the entire encoding and charset thing:

The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) by Joel Spolsky.
This is a must read!

Specifically, to solve the above problem we changed the doc to UTF-8 without BOM. Most text editors support conversion between the different unicode transformation formats, and allow the user to decide whether to add a BOM to UTF-8 or not (it's probably better not to add it; just turn the option off).



=========================================
Added 21/7/08:
------------------------------
Just found this old newsgroup entry on the subject...
http://biglist.com/lists/xsl-list/archives/200208/msg01302.html
=========================================

Thursday, June 26, 2008

Tag, Log, Debug!

Yesterday (25/6) I presented, with my friend Alex Romanov, our work on "Automated Log Generation and Analysis using Collaborative Tagging" at the IBM Programming Languages and Environments seminar 2008.


This is Alex with the poster describing our work:






More info on this work can be found here: http://ed.finnerty.googlepages.com/taglogdebug.
Here you can find the initial paper.
And finally, a blog that will follow our work was opened here: http://taglogdebug.blogspot.com

Tuesday, June 24, 2008

Eclipse Plugins Tutorial (OOPSLA '07)



I was asked where people can get the materials from the tutorial on "Creating Plug-ins and Applications on the Eclipse Platform" that Alex Romanov and I gave in Montreal at OOPSLA '07. It was a while ago, and the materials are on the web already. But here are the links, as it seems google is not doing a good job referring people to the googlegroups site we created...
The tutorial deals with all the important things for getting started with Eclipse plug-ins: creating a plug-in project, the components of a plug-in project, GEF, MVC in the Eclipse Plug-in architecture, SWT, Actions, RCP and more.

So here is the companion booklet.
And here are the slides.
Both can be printed or redistributed, as is, for any purpose, referring back to the source.

Hope you find it useful.


Wednesday, June 11, 2008

False Requirements

A giraffe
Suppose you are assigned to design a system that should carry people from place to place, and occasionally, let's say once in ten years, would have to transport a giraffe. It's tricky, but you may be able to come up with some strange car with a very high ceiling, working out the balance and stabilization. It may not be so economical, but who is going to be a piker when it comes to carrying giraffes? (One can of course suggest a car without a ceiling at all, which might be a good solution, but one of the other requirements rejects a convertible.)

It turns out that we often bend and twist simple systems just to carry the giraffe. And in many cases, when digging in, we find out the giraffe was not even in the formal requirements to begin with! It crept in somewhere, in someone's imagination, and became an important part of the system. Oh, how much we could have saved without this giraffe, and how much simpler the system could have been...

Let the giraffe travel on its own!

If it was in the original requirements, go back to the system analyst or the guy who wrote the requirements and ask him: do you really need this giraffe thing? Maybe we can send it in another vehicle?

Giraffes are nice, but don't let them into your system.
Unless you want to run a zoo.

Friday, May 30, 2008

What would you like to know about your candidate?

 (Or - how can you know that he is Smart and Gets things done?)

After my session at JavaDay I had a conversation with a colleague from Sun about technical interviews. We started with the question whether a Java candidate should know the details of Generics (e.g. know the term "erasure" and what it means). I'll touch on this in a moment, but let's begin in an orderly manner.

What do you need to achieve in an interview?

The famous answer comes from Joel Spolsky's well-known article "Smart & Gets Things Done" (it has an updated version, but I like the original one better. And it has by now also come out as a book carrying the same title!) Sorry for the massive PR for Joel (he is NOT my cousin), but this is really a great source to begin with.

Following Joel, let's just mark out that you need to achieve two things in an interview:

  1. Attract the candidate: sell yourself, your company and the offered position. Make sure that if you want this person you will get him.
    -- This is highly important and sometimes neglected (e.g. making the candidate wait 30 minutes for his interview), but we will keep this for another post.

  2. Make a Hire / No Hire decision (is the candidate Smart and Gets things done?)

__________________

What do you need to know about your candidate?

  • Communication skills - can he explain himself well? Ask him about his previous job or a big project at college, and see if he can explain well what he did.
  • Does he see the big picture? - continuing with his big project, make him explain what others did in the same project, ask about the alternatives there were to implement the same thing and why the one taken was selected.
  • Technologies - ask about the technologies used in his big project and what he can tell about each of them.
  • Design - you can learn about the candidate's design abilities by talking about the design of his big project, but it would be better here to tackle him with something he hasn't thought about, or at least didn't talk about in his previous interviews. It can be a pretty simple design, like some kind of LRU Cache, or a Copy-on-Write mechanism, or anything else you have in mind. But make sure to listen - ask briefly, give enough time to think and then listen. This is a good opportunity to see how the guy thinks.
  • Implementation - if the candidate is going to write code, let him do it in the interview. You will be amazed how many fail at this stage. There is a debate whether the coding phase should be on paper, on a PC with only a text editor (and Help?), or with a full-blown IDE. Each of the alternatives may shed light on a different interesting angle. The super-candidate should excel in all of the environments; he may be using all of them in his day-to-day life simultaneously. What would we say about a candidate who gets the IDE but, when lagging behind, gives the excuse that he is not used to this IDE? I believe a paper exam for a short implementation is good, as long as you tell the candidate that it's OK if he makes small syntax mistakes. The code should be considered as pseudo-code, but all important aspects should be included (e.g. resource management and exceptions). Then of course he has to explain what he did.
  • Reading other people's code - many developers are really good at writing their own new code but do miserably when it comes to working with other people's code. During the implementation phase you should check this with your candidate: give him some strange API that he has to use, suggest your implementation of one of the methods and ask for his opinion (it might be a good idea to throw some obvious bug into your implementation).
  • Effectiveness, Tidiness, Laziness - do not ignore the tidiness you see during the interview. Usually this is what you will get later. Pressure is not an excuse for a mess. Watch the way the code is laid out, either on the paper or in the IDE. Laziness is OK as long as it does not affect the essence. In fact, laziness is something you want to find in your candidate. You want to see that he gets things done, and for that he should not finish some of the assignment you gave him, but it should be the trivial part. A candidate who cannot live with an unfinished task, in a way that prevents him from even doing the parts that he could have finished, is someone who does not get things done. Effectiveness means starting with the tricky stuff, leaving the trivial for the end. Identifying what is tricky and what is trivial is being smart.
  • How does he fit the position? - this is the place, after presenting the position and the assignments he would have to handle, to ask the candidate how he sees himself in this position, what he likes and what he doesn't, what he is strong at and what he still has to learn. This is the place to see that the candidate is attached to reality. You have already seen him; you got some feelings and impressions about him. Does your impression fit what he says about himself?
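As an illustration of the kind of short design exercise mentioned above (the LRU Cache), here is one possible Java answer: a sketch built on LinkedHashMap's access-order mode (my own example, not part of the original interview checklist).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A minimal LRU cache: in access-order mode, LinkedHashMap keeps the
// least-recently-used entry first, and removeEldestEntry lets us evict it
// once capacity is exceeded.
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruCache(int capacity) {
        super(16, 0.75f, true); // true = order entries by access, not insertion
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }

    public static void main(String[] args) {
        LruCache<Integer, String> cache = new LruCache<Integer, String>(2);
        cache.put(1, "a");
        cache.put(2, "b");
        cache.get(1);      // touch 1, so 2 becomes the least recently used
        cache.put(3, "c"); // evicts 2
        System.out.println(cache.keySet()); // prints [1, 3]
    }
}
```

The point of such an exercise is less the final code and more hearing the candidate reason about eviction order and the data structure trade-offs.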

----------------------------------
Addition, 12/6/2008:
----------------------------------
Asking for the candidate's grades, maybe even his SAT score (called the Psychometric exam in some countries), can also be a smart thing, assisting you to pinpoint the smart guys. It doesn't necessarily help (this is why you are having the interview), but it may shed some more light. And in my experience, even if there isn't a complete correlation with performance, high grades do say something and poor grades also have their say. Usually, not a surprise, a guy with high grades will do better than his fellow with poor grades. But do not let the grades confuse you; I saw some promising candidates with marvelous grades who could barely complete an if-else.
----------------------------------

How much time should it all take?

Including the first part of attracting the candidate (speaking about the position, the organization, and yourself) - not less than 90 minutes. 2 hours may be reasonable.

__________________

Is it a true / false question?

Joel says it is. There is only Hire / No Hire, nothing in between.

I agree: if we have the budget and attraction to get all the MacGyvers that we need, then yes. I do not mean that all employees should be superheroes. They should, however, be superheroes in their field. If we need someone to handle XML configuration files, we need someone who is a master in this domain, whistles XPaths in his sleep and grinds XMLs through XSLs daily. He will not find his duty boring, because he will deal with making it more efficient by inventing new utilities. The same goes for a secretary: we will want someone who knows how to create a table of contents in documents and use sophisticated attributes of the spreadsheet.

But what if we cannot attract superheroes in the domain? Suppose the budget we have for developers doesn't bring us the most brilliant ones and we have to compromise.

Building your compromises

  • The candidate should have a positive marginal output. Writing bugs gives negative output; writing bugs half of the time may as well. Positive marginal output means that you have to take into account the effort and investment to be made in this candidate if recruited.
  • Marginal output is related to the current mix of your team. If you have "good thinkers" but lack "coders", head for those who can code fast, even if they are not considered design superheroes.
  • Remember that there are qualities that can be gained (e.g. experience), while some other qualities are inherent. Prefer smart over experienced and capable over well trained. A smart novice can quickly become smart and experienced.

Now back to the question we began with. Suppose we recruit a Java developer. Does the candidate need to know what "erasure" means for Java Generics?

The answer is simple: if you already have in your team someone who has read the Generics literature and can serve as the team knowledge base, it's not the end of the world if the candidate does not know it. The team will have the knowledge. BUT - if you are seeking that "curious developer" right now, make sure he proves his curiosity!

Summary

Hiring decisions are among the most important decisions made by managers. The damage that can be done by an unfit employee is huge. Investing time in an organized recruitment process, and in each of the relevant candidates, is a must. In the end, the employee selected must have a positive marginal output that justifies his salary. Marginal output is related to the current mix of your team, so recruitment decisions should be made with the set of people that you have in mind, and with an understanding of the qualities that you miss in your team and want to focus on.

----------------------------------
Addition, 12/6/2008:
----------------------------------

Is Smart and Gets Things Done the whole thing? What about motivated, doesn't get bored too quickly, has a good temper?
You may want to read these comments on Joel as well.

----------------------------------

Monday, May 26, 2008

Java Generics, Erasure, Reification

I gave a talk today at Sun's Java Day 2008, Israel, on Java Generics, Erasure, Subtyping, Super Type, Wildcards and a few words on Reification.

Though it's a bit of an old topic (Java 5), and though the use of Generics is straightforward, the deeper understanding of why one gets type-unsafe warnings, how to generify your own utility classes etc. needs some more attention. So I still find it relevant to talk about, and it seems that it was relevant for most of the audience (I hope! ... well, in fact this is the feedback I got after the talk).

The presentation itself is available in pdf format here.

I just want to post here a small piece of code from the presentation. The main point of this code is how you can cast an object to T when T was actually erased and you don't have it at run time.
The important lines are emphasized.


public <T> T getContent(Column<T> column) {
    Object colContent;
    String colName = column.getName();
    if (colName != null) {
        colContent = getContent(colName);
    }
    else {
        Integer index = column.getIndex();
        // TODO: if null throw something of your own
        // (for now it will be NullPointerException)
        colContent = getContent(index);
    }
    if (colContent == null) {
        return null;
    }

    // now we need to cast the returned type and that's not trivial

    // need raw type as a bypass
    // usually use Class<?> when the type of the Class is unknown
    // but it won't work here!
    @SuppressWarnings("unchecked")
    Class colClass = colContent.getClass();

    // we know that we hold the correct type
    @SuppressWarnings("unchecked")
    Class<T> colClassT = colClass;

    // now we can use the cast method of Class<T>
    // If we were wrong regarding the type ClassCastException
    // will be thrown (we can wrap with catch and replace with
    // some application exception if necessary)
    return colClassT.cast(colContent);
}

--------------------------------------
Addition, 27/05/2008:
--------------------------------------
A day after posting this, I got a very innocent question on the code: why not just use a simple cast to T: return (T)colContent;
Well, I remembered that it's Not OK. But checking it again I recalled that Not OK here means that you get a 'Type Safety: Unchecked Cast' warning, yet it does compile. What the compiler tells you is that it cannot check that the cast is fine at compile time, which is OK with us.
(Isn't it always the case that the compiler cannot check down-casting at compile time? Which raises the question of why Java decided to give a warning on this...)
So, a simpler version of the code would be of course:

    // ...
    // now we need to cast the returned type
    // we prefer to use a local var so we can put
    // the SuppressWarnings here and not on the method
    // (you cannot put it on a return statement)
    @SuppressWarnings("unchecked")
    T retval = (T) colContent;
    return retval;
}

Now, one can just ask: so what's all this business with the Class<T>.cast() method, if we can directly use the "plain old" casting way of (T)?
Well, in cases where Class<T>.cast() saves the 'type safety' warning, it is worth using. In our example above we get the warning anyhow (for the implementation using Class<T>.cast() we even get two), so in such a case it would be a better choice to go back to the "plain old" (T) cast.
For an example where Class<T>.cast() indeed avoids the 'type safety' warning, see item 29 in "Effective Java", details below.
Of course, if you do need the Class<T> instance itself, e.g. for reflection purposes, the original way presented above would be relevant.
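For the curious, here is the kind of case where Class<T>.cast() does save the warning: the typesafe heterogeneous container of "Effective Java" item 29. This sketch is my own paraphrase of the idea, not the book's exact code.

```java
import java.util.HashMap;
import java.util.Map;

// A typesafe heterogeneous container: the Class<T> token is both the map key
// and the thing that lets get() return T with a run-time-checked cast,
// so no unchecked-cast warning is needed anywhere.
public class Favorites {
    private final Map<Class<?>, Object> favorites = new HashMap<Class<?>, Object>();

    public <T> void put(Class<T> type, T instance) {
        favorites.put(type, type.cast(instance));
    }

    public <T> T get(Class<T> type) {
        return type.cast(favorites.get(type)); // checked at run time, no warning
    }

    public static void main(String[] args) {
        Favorites f = new Favorites();
        f.put(String.class, "java");
        f.put(Integer.class, 42);
        System.out.println(f.get(String.class)); // prints java
    }
}
```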

------------------------------------------------------
End of Addition, 27/05/2008.
------------------------------------------------------

The idea behind the code is based on Josh Bloch's "Effective Java" 2nd Edition, item 29.


(The code example is not in the book, it's my example based on item 29. So in case of any error please find me responsible...)