Thursday, November 25, 2010

A simple generic template function

A simple generic template function for getting the minimum and maximum elements from an STL container or a plain array.

Nothing too complicated, I just liked this example.

#include <iostream>
#include <list>
#include <string>
#include <utility>

using namespace std;

// Returns iterators to the minimum and maximum elements.
// Assumes the range [begin, end) is not empty.
template<class Iterator>
pair<Iterator, Iterator> minMaxFinder(Iterator begin, Iterator end)
{
    Iterator min = begin;
    Iterator max = begin;
    for (++begin; begin != end; ++begin) {
        if (*begin > *max) {
            max = begin;
        }
        else if (*begin < *min) {
            min = begin;
        }
    }
    return pair<Iterator, Iterator>(min, max);
}

int main()
{
    int iArr[] = {15, 5, 70, 2};
    pair<int*, int*> minmax = minMaxFinder(iArr, iArr + 4);
    cout << *minmax.first << ", " << *minmax.second << endl;

    list<string> sList;
    sList.insert(sList.end(), "small");
    sList.insert(sList.end(), "smallish");
    sList.insert(sList.end(), "big");
    sList.insert(sList.end(), "biggish");
    pair<list<string>::iterator, list<string>::iterator>
        minmaxS = minMaxFinder(sList.begin(), sList.end());
    cout << *minmaxS.first << ", " << *minmaxS.second << endl;

    return 0;
}


Code formatted with http://codeformatter.blogspot.com

Monday, November 22, 2010

Thoughts on Open Source licenses, Patents on Software and such

I'm dealing with an Open Source usage approval cycle, which is an important task in any company; the last thing you want is your developers using whatever they find on the web.

A few insights and thoughts from my experience:

  • Developers in general are mostly ignorant of legal issues. If not controlled, they may embed a free 30-day evaluation copy in their system just because the word "free" appeared somewhere on the site. In most cases they don't bother to read the license.
  • With off-shore developers the problem is even bigger. They tend to be much more liberal with open source, without seeing the risks. Even if they do follow company policy and submit requests for open source usage, you may find many more "embedded" un-approved snippets and libraries inside their code. I tend to think the reason is the distance: they believe that even if caught, the most you could do is yell at them over the phone or in e-mails; you cannot beat them physically, and they take advantage of that.
  • For the above reasons and others, usage of open source must be controlled. There are scanning tools on the market that help you find un-reported usage of open source and external commercial software. Such tools are helpful in finding the disobedient developers who still drop in whatever they like, fixing it in time, and beating them while the felony is still hot.
  • Scanning tools also point at many usages that are a very small snippet of something that might have been taken from an open source project or even from an un-licensed example on the web. To some this seems like a problem that should be fixed in the code; I personally believe that the rights to how to perform quicksort do not belong to anybody, even if it is part of some open source project or published somewhere on the web. Taking two notes from a melody doesn't infringe its rights.
  • The same goes for patents on software. Patenting an algorithm is problematic, but many patents are on "a method and a system". I have one such patent myself. Does it really prevent anyone from creating a new, similar development? Should it?

Monday, September 20, 2010

Resource Files

Programs interact with a user in many ways.
They write things to a screen, send messages over the network, say things aloud. And more.

In most cases, except maybe for debug logs, the strings conveyed to the user will need to be edited, and in some cases even translated into other languages.

It's a pity to still see, even today, modules that interact with a user without using external resource files. Guys - how do you expect someone to edit your lovely messages into something readable? And translate them into Swedish? Or Yiddish?

The technique of using a resource file is a very old trick. The problem is that in the early days of a project it is not the top priority (that's wrong, guys! the price at the beginning is very low!), and by the time it does become relevant, the code is full of strings in so many different places that cleaning it up is a nightmare.
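As a minimal sketch of the trick in Java (the bundle name "messages" and the key "greeting" are made up for the illustration; the same idea exists in every language and framework):

import java.util.Locale;
import java.util.ResourceBundle;

public class Greeter {
    public static void main(String[] args) {
        // Loads messages.properties from the classpath, or messages_sv.properties
        // if the default locale is Swedish - no code change needed for translation.
        ResourceBundle bundle = ResourceBundle.getBundle("messages", Locale.getDefault());
        System.out.println(bundle.getString("greeting"));
    }
}

All the user-facing text now lives in one editable file instead of being scattered through the code.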

Things to do on the first day when starting a new project (don't postpone it to "later"):
1. Use a good logging mechanism (an existing one, don't reinvent the wheel)
2. Use automatic build mechanism
3. Use a resource file
(Do you have more? Please add them as a comment!)

Well, wait, I have one more: have a defined API (slash protocol, slash wireframes) for the modules you are developing.

Thursday, August 19, 2010

Semi-colon and java.lang.OutOfMemoryError

; ; ; ; ; ; ; ; ; ; ; ; ; ;

I want to share with you a crash at a customer site caused by java.lang.OutOfMemoryError.

Here is the original code:

if (synchRemove(lobj.getSeqNum()) != null);
    timeoutedList.add(lobj);

Can you see the problem?

(Well, it's much easier once the relevant lines of code are isolated. In reality it took a few days and nights to get to these lines; remember, it occurred in a customer environment where not all the relevant information is easily available to the development team, and the OOM doesn't necessarily occur at this line.)
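Once you see it, the fix is trivial - remove the redundant semicolon so the add really is conditional, and put braces around the block:

if (synchRemove(lobj.getSeqNum()) != null) {
    timeoutedList.add(lobj);
}

With the stray semicolon, the if statement had an empty body and timeoutedList.add(lobj) ran unconditionally, which is what let the list grow without bound.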


Moral:

  1. Small semicolon can cause big troubles
  2. It's hard to see everything in code review. A trouble-making redundant semicolon can slip past the eyes of the reviewer
  3. A load test may find such cases (but may still miss them if the relevant scenario is not exercised)
  4. Good unit tests may also help
  5. Most coding guidelines require curly brackets for any block, even one containing only a single line. This could reveal the error (if not by the developer, then by the reviewer in code review)
  6. Static code analysis tools, like FindBugs (which is free), do point at such errors!
  7. In some IDEs (e.g. Eclipse) you can configure the IDE to present warnings on such cases

In the above case, analyzing the heap dump using MAT led to the problematic giant list; tracking all insertions into that list in the code then led to the faulty line.


Thursday, June 3, 2010

... and the UI is also going back

Following my last post on concurrent processes, my good friend Effie Nadiv noted that UIs are also going backwards.


Back in the old days, Lotus 1-2-3 had a menu line that did not open to overlay your main screen; instead, the menu line itself changed each time you made a selection. One can argue that this is not optimal, but I remember it as something very useful. Today you sometimes find yourself struggling to open a sub-sub-sub-menu to select something, trying not to close the entire thing before you make your selection.

It has also become common practice that any new dialog or window may hide previous ones, extra info may hide the main data, and so on. Effie argues that it all stems from the philosophy that the application may do whatever it wants. Restrictions would hurt creativity, and we don't want that. So you get creativity in the new versions: my new version of Babylon has a fancy UI, ignoring the fact that it doesn't work, while the old one, with the standard Windows UI, was OK - a real charm. The new Office is a nightmare once you've gotten used to something. Every new version of a piece of software tries to justify itself with a fancy new visual design that you don't want. You just want it to work.

Process Concurrency - are we stepping ahead or backward?

The first version of Apple's iPad allows only one running process. A bit limited, but most people report high satisfaction. On the other hand, my OS allows multiple processes to run in parallel, and it is starting to drive me crazy.

Why process concurrency is a BAD THING

  • You are showing a presentation and suddenly a pop-up from another process appears
  • You are doing a critical debugging task, all focused and sharp, and then your machine becomes sluggish as some background process decides to take CPU time or memory
  • You open your task manager to understand why your machine is so sluggish and see a list of so many processes. What is all that? I don't need half of it. Yet the first "terminate" you try shuts down the system.
  • Any utility today can decide to run in your background; to really control what runs there you need to be an expert, otherwise you lose your background to all sorts of things you don't know much about.

How often do you click Ctrl+ALT+Del?

I find myself using Ctrl+ALT+Del at least once a day to check what's holding my machine up. How about you?

So, how was it back then, in the good old times?

In DOS, you could have only one process running. If you wanted to jump to another task without closing your application, you could use a TSR, but only one TSR could pop up at a time, and when it was in the background it took NO resources - it just sat there waiting for an interrupt without making a sound.

The un-responsiveness we get from our systems today is unbearable. Applications decide to go to the background to update themselves, index the file system, or do other non-critical tasks. The feeling is that the resources of the machine are not yours. You are lucky if the machine throws you a bone and gives you some resources.

The user experience you get today on a quad-core, x-GHz CPU with x GB of fast RAM is worse than what you got on an old 8086 machine a thousand times weaker. The difference is that the old applications were much more focused on being lean and mean, never went to the net during their operation (the "web" was BBSes and gopher), and they didn't run in parallel.

We need to go back to the old approach, giving the stage to one application at a time - all others would be HALTED and would not take any system resources. It sounds so limiting, but this is how DOS used to work. And this is how the first version of the iPad works.

Monday, May 31, 2010

Java Unit Testing

There are a lot of tools and options out there for Java unit testing, either replacing JUnit or sitting on top of it. Let's bring some order to things.

1. TestNG vs. JUnit4

TestNG came out to add features that were missing in JUnit 3.x.
It did quite a good job, and some may say that TestNG 5.10+ is still better than JUnit 4.7+.
TestNG has a better data-providing mechanism.
TestNG has test dependency definitions, which are missing in JUnit.
TestNG has test groups, with fixtures at the group level. It also has the ability to automatically create a run file for all failed tests.
All that said, JUnit has more extension libraries, built on its ability to extend JUnit's TestCase and TestRunner.
Both have very good IDE support, as well as Ant and Maven support.

When it comes to choosing, both are good. I stick with JUnit.

Note that JUnit 4 made the mistake of integrating another library into its jar, namely hamcrest-core. They should have known better than that... The problem is that when you bring in the full hamcrest library (to get the really powerful matchers that you need in your tests), it gets confused with the copy that came with JUnit. The solution is simple: put the full hamcrest jar that you bring first in the classpath, before the JUnit jar.
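As a small illustration of the matchers in question (a sketch - class and test names are made up, and it assumes the full hamcrest library jar sits on the classpath before the JUnit jar, as described above):

import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.Matchers.hasItem;
import static org.hamcrest.Matchers.hasSize;
import static org.hamcrest.Matchers.startsWith;

import java.util.Arrays;
import java.util.List;

import org.junit.Test;

public class MatcherExampleTest {
    @Test
    public void listMatchers() {
        List<String> words = Arrays.asList("small", "smallish", "big");
        // hasItem/hasSize/startsWith come from the full hamcrest library,
        // not from the hamcrest-core classes bundled inside the JUnit jar.
        assertThat(words, hasItem("big"));
        assertThat(words, hasSize(3));
        assertThat("smallish", startsWith("small"));
    }
}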


2. Load

JUnitPerf does the job with its ability to run tests in several threads.
It's easy and useful, both at the integration stage and for any piece of code that may act differently under load.

The following code was tested with JUnitPerf, and the test catches the bug when addIfAbsent is not synchronized:

import java.util.ArrayList;

public class MyList extends ArrayList {
    // we test with the synchronized keyword and without it
    public synchronized boolean addIfAbsent(Object o) {
        boolean absent = !super.contains(o);
        if (absent) {
            super.add(o);
        }
        return absent;
    }
}


The test that catches the bug when the synchronized keyword is removed:

import junit.framework.Test;
import junit.framework.TestCase;
import junit.framework.TestSuite;

import com.clarkware.junitperf.LoadTest;

public class MyListTest extends TestCase {
    private MyList myList = new MyList();
    private int count = 0;

    public MyListTest(String name) {
        super(name);
    }

    public static Test suite() {
        TestSuite suite = new TestSuite();
        Test testToRun = new MyListTest("testMyList");
        int numUsers = 1000;
        int iterations = 5;
        suite.addTest(new LoadTest(testToRun, numUsers, iterations));
        return suite;
    }

    public void testMyList() {
        if (myList.addIfAbsent(count)) {
            // there is a gap here that may cause an error on a correct
            // implementation, in case of a thread switch at this point
            synchronized (this) {
                ++count;
                assertEquals(myList.size(), count);
            }
        }
    }
}


This solution is not bulletproof.
Theoretically it may report errors on a correct implementation, and of course it can always miss a bad one, as it's a matter of timing.
However, in reality it does catch the problem!
JUnitPerf is simple and gets the job done.


3. Thread testing

It would be nice if we could time the behavior of the code under test, to mimic race conditions and check how the code operates. The idea is to deliberately create a race condition and then check that a synchronization lock works as expected, ensure that unlocked blocks are fine, check for deadlocks, and see if there is data corruption.

With such an ability we could have created the race condition in the above code with a deterministic approach instead of using load.
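Without such a tool, a rough sketch of the idea with plain java.util.concurrent is to line two threads up on a latch so they hit addIfAbsent at (almost) the same moment. This only raises the odds of a collision - it is still not deterministic, which is exactly what the tools below try to fix:

import java.util.concurrent.CountDownLatch;

public class RaceSketch {
    public static void main(String[] args) throws InterruptedException {
        final MyList list = new MyList();
        final CountDownLatch start = new CountDownLatch(1);

        Runnable task = new Runnable() {
            public void run() {
                try {
                    start.await();                 // both threads block here...
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                list.addIfAbsent("duplicate");     // ...and race on the same value
            }
        };

        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        start.countDown();                         // release both threads at once
        t1.join();
        t2.join();

        // With a correct (synchronized) addIfAbsent the size must be 1;
        // an unsynchronized version may occasionally end up with 2.
        System.out.println("size = " + list.size());
    }
}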

Suggested tools:

  • thread-weaver (version 0.1)
    allows the test to make different threads reach code points in a certain order, either explicitly or "semi-automatically"
  • MultiThreadedTC (version 1.01)
    allows setting time tickers on objects in the test itself, but NOT in the tested code, and is thus less relevant for most testing purposes

Neither tool is highly supported; my recommendation is still not to rush into them…


4. Mock Objects

There are 3 cases where you need a Mock Object in your test:

(a) Tested code gets a complicated object as parameter
Solution: Mock the parameter with a Stub/Mock implementation

(b) Tested code invokes a static call on some resource
Solution: Replace the static call with a Stub/Mock implementation

(c) Tested code creates complicated objects
Solution: Replace the created objects with a Stub/Mock implementation

The mock utilities rely on one of the following technologies: java.lang.reflect.Proxy, replacing the ClassLoader, byte-code instrumentation, or AOP weaving (which itself relies on replacing the ClassLoader or on byte-code instrumentation).

Possible tools: Mockito, JMock, JMockit, EasyMock, JEasyTest and many others…

There are differences between the tools in syntax and abilities.
Tools that rely on Proxy cannot mock static behavior or object creation.

The two possible combinations that made it to my final round were:
(a) JMockit 0.998
(b) EasyMock 3.0 + PowerMock 1.3.8

JMockit seems powerful and well documented, but one of my tests failed for no apparent reason, and when I debugged it I saw that an exception was thrown within the JMockit code itself. I probably did something wrong in the test, but still, this is not what I expect. Maybe the 1.0 version will be better...

My selection here would be EasyMock + PowerMock
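A minimal sketch of case (a) with EasyMock 3.0 (the SeqNumSource interface and SeqNumConsumer class are hypothetical, made up only for the illustration):

import static org.easymock.EasyMock.createMock;
import static org.easymock.EasyMock.expect;
import static org.easymock.EasyMock.replay;
import static org.easymock.EasyMock.verify;
import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class SeqNumConsumerTest {

    // A hypothetical collaborator that would be complicated to construct for real.
    interface SeqNumSource {
        long nextSeqNum();
    }

    // Hypothetical code under test: it receives the collaborator as a parameter.
    static class SeqNumConsumer {
        String describe(SeqNumSource source) {
            return "seq=" + source.nextSeqNum();
        }
    }

    @Test
    public void consumerUsesTheSource() {
        SeqNumSource source = createMock(SeqNumSource.class);
        expect(source.nextSeqNum()).andReturn(42L);
        replay(source);

        // The mock stands in for the complicated parameter (case (a) above).
        assertEquals("seq=42", new SeqNumConsumer().describe(source));

        verify(source);
    }
}

Cases (b) and (c) are where PowerMock comes in on top of EasyMock, to replace static calls and object creation.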


5. Other tools

Other tools in this domain worth mentioning:
- DBUnit
- HttpUnit (not only for tests)
- Selenium (it is a good tool for Web UI tests in general, and you can operate it from your java unit tests if you'd like)

Wednesday, April 14, 2010

Beware of your long tail garbage

IBM’s PLDE seminar 2010 (IBM Programming Languages and Development Environments Seminar 2010)

An interesting session by Kathy Barabash on GC running on parallel cores raised an important point I hadn't thought about: long reference tails are harder for the GC to parallelize, as the GC cannot efficiently split a long list between two threads. Thus, if you create references at a long distance from the root (a local or static reference), your GC time will be longer compared to the same number of references in a more spread-out structure.
XALAN seems to sin in this respect.
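To make the "long tail" concrete, here is a hedged little illustration in Java (class names are made up): both structures below hold the same number of objects, but the chain is reachable only one hop at a time from its root, while the array's elements are all reachable directly from it, so the scan is easier to split between GC threads:

// A "long tail": each node is reachable only through the previous one,
// so a parallel collector cannot easily split the walk between threads.
class ChainNode {
    ChainNode next;
}

public class GcShapes {
    public static void main(String[] args) {
        ChainNode head = new ChainNode();
        ChainNode current = head;
        for (int i = 0; i < 1000000; i++) {
            current.next = new ChainNode();
            current = current.next;
        }

        // A "spread" structure: the same number of objects, all referenced
        // directly from one array that is easy to partition between GC threads.
        Object[] spread = new Object[1000000];
        for (int i = 0; i < spread.length; i++) {
            spread[i] = new Object();
        }

        // Keep both roots alive so the collector actually has to trace them.
        System.out.println(head.next != null);
        System.out.println(spread.length);
    }
}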

Gilad Bracha opened with an overview of mistakes made in Java 1.0 which live on in Java to this day.

Gili Nachum writes about the above two in his JavaTuning blog:
http://www.javatuning.com/ibms-plde-seminar-2010-review/

Wednesday, March 10, 2010

Cisco's new router delivering 322 Tbit/s

Cisco announced a new core router, named CRS-3, delivering 322 Tbit/s:
http://www.lightreading.com/document.asp?doc_id=188914&

My brother-in-law is part of this project. I'm not sure whether his part is the extra 22 terabits beyond the 300, but he has been on the inside for the past few years and has a major role there.

Whether or not this is a world-shaking event for the internet, we will see, but it is certainly a landmark compared to the 56 Kbit/s of not so long ago (well, this is not an honest comparison, comparing home speed to the core, but still, the rate at the core level has gone from gigabits to terabits - a factor of 1,000).

It's a very happy announcement for companies doing video services, IPTV, etc. Of course there is still the need to propagate these speeds to the home, with fiber to the home, but this is already happening in some countries and will expand (Hong Kong, The Hague, more of Europe, and one city in Kansas even changed its name to get fiber).

With 322 Tbit/s, it sounds like someone really needs to start working on teleporting.

Tuesday, February 16, 2010

Multiple Inheritance - should you?

It is often argued that multiple inheritance in C++ is not a good practice; this is why Java doesn't have the ability. Of course, using multiple inheritance in C++ for something similar to interface implementation in Java is reasonable: this is the case when all the classes inherited from, except one base which is the true parent, are pure abstract, with no data members and no implementations, only pure virtual methods.

However, in some cases you do see multiple inheritance in C++ that is not just interface implementation, and in some of those cases it seems reasonable. The iostream library uses it quite a lot.

I had a code review in which the developer used multiple inheritance. He said he had considered whether to use it or not and decided it served his goal. Indeed it was OK: there was a class "SomeSortOfEventHandler" that was of both types "A_CertainEventHandler" and "AnotherKindOfEventHandler"; that is, "SomeSortOfEventHandler" was in fact a handler for two kinds of events. In this case there was no need for virtual inheritance from the top parent ("AbstractEventHandler"), as the special "SomeSortOfEventHandler" needed to have the data and behavior of both its parents.

So far so good. But at a certain point there was a need to implement a virtual method in two flavors, one for being the child of one parent and the other for being the child of the other. This becomes nasty. One can add to the virtual method a parameter of type T (there is a template class at the top, that differentiate the different families under "AbstractEventHandler") - but this is not so elegant as there is no real need for this parameter. All other alternatives were also breaking the elegancy of the inheritance tree. Conclusion: better to create two separate classes and hold a pointer from one to the other to get the relation between two objects, rather than use multiple inheritance.