Thursday, 29 December 2011

There are two methods of security in IT that are not usually wise. The first is security by obscurity, in other words, if we don't tell people how we implement security, they are unlikely to break it. The problem with that view is that it isn't true! DVD encryption was broken, various sites have been cracked by trying various attack vectors. In reality, it is better to be open about your security if you can ensure you have a suitably helpful audience who can then advise on areas that might be weak. Oyster Cards and even Chip and Pin have exhibited weaknesses that could possibly have been found by peer review before they were implemented but both were 'closed source' and both have been compromised.
The second method of security is by permutations. My system is safe because you would need to try X combinations before you can crack it and X is a very large number. Of course, as we know, X is never a large number for more than a few years as newer faster computers and networks allow higher bandwidth for hacking attempts. In reality, your passwords should be 16 characters or more with punctuation, upper and lower case etc to provide any decent level of brute force defence but remember again that just because something is unlikely, doesn't mean it won't happen. You can mitigate however against brute force by slowing down the responses either progressively or even by a fixed amount of time (say 10 seconds) which thwarts faster hacking equipment. You can also block suspicious activity (such as 5 failed login attempts) either for a period of time like an hour or until 'manually' unlocked. In any case, you should be open with people who use your systems what the limits of your security are with statements such as, "WPS requires an 8 digit pin which would require X days of brute force to get past" which allows someone to consider this statement, perhaps challenge it and then allow the manufacturers to re-design or upgrade the equipment to increase the security.
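The lockout idea above can be sketched in a few lines. This is only an illustration under stated assumptions: the class name `LoginThrottle`, the threshold and the lockout period are all hypothetical, and a real system would also need persistence and thread safety.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: counts failed logins per account and locks the
// account for a fixed period once a threshold is reached.
public class LoginThrottle
{
    private readonly int maxFailures;
    private readonly TimeSpan lockoutPeriod;
    private readonly Dictionary<string, int> failures = new Dictionary<string, int>();
    private readonly Dictionary<string, DateTime> lockedUntil = new Dictionary<string, DateTime>();

    public LoginThrottle(int maxFailures, TimeSpan lockoutPeriod)
    {
        this.maxFailures = maxFailures;
        this.lockoutPeriod = lockoutPeriod;
    }

    // True while the account is still inside its lockout window.
    public bool IsLocked(string user, DateTime now)
    {
        DateTime until;
        return lockedUntil.TryGetValue(user, out until) && now < until;
    }

    // Record a failed attempt; lock the account when the threshold is hit.
    public void RecordFailure(string user, DateTime now)
    {
        int count;
        failures.TryGetValue(user, out count);
        count++;
        if (count >= maxFailures)
        {
            lockedUntil[user] = now + lockoutPeriod;
            count = 0;
        }
        failures[user] = count;
    }

    // A successful login clears the failure count.
    public void RecordSuccess(string user)
    {
        failures.Remove(user);
    }
}
```

With `new LoginThrottle(5, TimeSpan.FromHours(1))`, the fifth failure locks the account for an hour. As a rough worked figure for the fixed-delay approach: a full 8 digit pin space is 10^8 values, so at 10 seconds per guess the worst case is 10^9 seconds, which is over 30 years (ignoring any weakness in how the pin is actually checked).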
Sadly, we rely on hidden decisions made by people who may or may not be competent in their job and whose decision may or may not still be valid and which may or may not even be visible, being lost in the mists of time.
Anyway, hire competent and trained people and ensure they are keeping up-to-date with current threats otherwise your company will look stupid!

The NHS electronic patient records system was, like a lot of large IT projects, going to deliver large benefits to patient care, and why shouldn't it? Currently your GP and hospital records are mostly on paper or stored on independent systems which cannot communicate, so you could be admitted to hospital and the doctors might not know that you visited your GP a month earlier with similar symptoms. However, as with most large IT projects, it ended up running late and well over budget until the Government had little choice except to shelve it and cut their losses.
We use so much IT nowadays that we still haven't realised that we are mostly rubbish at running IT projects. In fact, we are pretty bad at running many projects but IT seems especially poor. What seems unreasonable to me is that the risks of IT projects are well known but yet they are not managed or removed to provide the ability to produce a useful albeit large IT system of some kind.
There are various problems and I have outlined a few below. We need to understand these problems, many related to human nature, and then decide what we will do about them.

Price must not be the main factor when choosing a supplier. While there is so much pressure on price, people are inclined to skip things that are seen as offering little benefit for their cost, such as quality and process improvements. When things are running a bit late, nothing other than pure functionality is going to be given any space. With a more open relationship with our customers, we can show what we are charging for what and therefore negotiate if unforeseen things happen. A customer will pay a 40% margin if they know anything about business; we shouldn't need to keep this a secret.

Time must not be unrealistically short. Customers always seem to want 6 months of work in 3 months and we are inclined to agree because we want the work. We perhaps think we can just about manage but of course we can't. People leave, you can't always recruit new people in time etc. Part of quality assurance should require that a supplier will only agree to timescales related to the design and planning and NOT by an arbitrary deadline imposed by the customer unless this is specifically factored in. There needs to be protection against unscrupulous suppliers who will say whatever the customer wants to hear in order to win the work. "Prove it or lose it"

Someone once said, "the customer is always right" but that is nonsense when applied to all business scenarios. Specifically, the customer is sometimes wrong and sometimes very wrong about many things. This balance needs redressing and, as experts, we need to improve our profile so a customer will trust us when we say something cannot be done, or not in the timescale required. I don't argue with my doctor over medical issues so why is it OK for a customer to argue with an IT supplier over a system?

The more people you involve, the more complex it gets. In order to counter this, you either involve fewer people (which risks alienating your users) or you make the system as simple as possible so there is something for everyone to use. For instance, an electronic patient record system could be very complicated, but even if it was just a multi-page document like a filing card system, that would be of enormous benefit just to record stuff like "administered 100mg of XYZ". I was once asked to write a database system for a bariatric clinic even though there was a European-wide database. The European one was so complicated, it was useless to any one person.

Requirements capture needs to be much better understood and only people of a relevant level of understanding in the customer organisation should be allowed to have formal input into the process. Asking every man and his dog for their input even though they might not understand that complexity is cost and that a system is more robust if it is coherent is clearly nonsense. Ideally, you should have a single point of contact with a customer who can have the arguments with various stakeholders on your behalf from a position of understanding. We have some large customers who have 20 people on the authorisation list for functional documents. Can you imagine how long they take for approval?

We need to understand and manage change better. If we are taking 2 or 3 years to deliver a system, its requirements will change over time. It is futile to pretend we will set requirements in stone. This was done for electronic MOTs and we ended up with dial-up connections required for all garages even though the internet and broadband were in full swing. We need to accept that change will happen and at least consider some things that might happen (even if their details are not fully known). We need to communicate the cost of change to a customer so they don't change things for the sake of it, and we need to try and get a system out of the door sooner, even if not all the functionality exists, so at least we can achieve milestones and get paid rather than chasing a continuously moving goalpost. We should obviously be writing code that allows for change as well and this needs to be taken seriously. All those short-cuts and "quick hacks" should simply not be allowed at all.

I think there needs to be better scrutiny of suppliers who are getting things wrong and even potentially legal action against people who might pretend to be doing it all right but are covering up incompetence and deliveries of systems that are basically rubbish as soon as they are released. If someone releases something under a quality assurance banner that cannot be maintained to any reasonable degree then they should have to pay some of the costs of re-designing it.

Quality needs to underpin the processes at every level and sadly this is still an after-thought or box-ticking exercise for many organisations. If they could understand what quality really means, the systems we deliver would be so much better.

Wednesday, 28 December 2011

For some reason, email hasn't changed fundamentally since its original inception back in the computer dark ages of underpowered PCs running over slow networks. It is basically a text-based protocol with no security whatsoever. You can spoof who the email is from and read any of them with a network sniffing tool. Everything since then has been tagged on top, like PGP for security, IMAP for mail servers and Exchange Server (whatever that adds). Basically I don't like email because it has few useful features and it is extremely over-used for a variety of tasks that it is definitely not suitable for.
In terms of functionality, I would like a feature to be added to email which is simply an "expiry" date. Many emails that are sent are only relevant for a period of time (long or short) and the ability to allow the email systems to delete expired mail would clear out unnecessary backlogs. For instance, an email telling you that a backup succeeded or failed is mostly irrelevant after the next backup occurs, so if you get 10 of these over the weekend, probably 9 of them can be discarded. I do not have the influence to make that happen, but I do have an influence over how I use email and require others to use it.
My old manager had something like 5000 emails in his inbox. He was an extreme example of badly used email. What is the point of having that many when you can only handle a few at a time? Here are some suggestions as to why he had too many and what you can do to avoid the same mistake:

Self-importance! If you think you are too important, you want to be copied in on everything. This betrays major issues either with trust or just how badly the organisation is run.

Incorrect use of email. Having emails for things like successful backups is perhaps acceptable if you only have one server and it backs up once per day but really, there are proper tools that monitor these things and which don't need to spit out tonnes of emails into your inbox. A monitoring system that has to send emails constantly to convince you it is running is not really much good!

Incorrect use of email too. Lots of people claim they write emails to leave a 'paper trail' of decision making to cover their backs later on. This is tosh. Whatever you have written in an email is likely to be misconstrued, which gives someone a defence later on when they decide that you didn't tell them to do X in the email. Pick up the phone if that's what is required. You can communicate 10 times more efficiently with your voice (and even more in person) than you ever can on email. Why take 10 minutes to try and formulate what you could describe and discuss in 30 seconds? If someone doesn't understand, they can ask you straight away rather than writing you a 10 minute reply!

Make correct use of importance and expiry. Although expiry is not part of the email standard per se, some providers give you an expiry option which allows you to mark an email like "you have left your headlights on" to expire after 4 hours, so it is not looked at by everyone who has been on holiday for 4 weeks.

Use correct subjects. Your subjects should be very concise, especially on emails going to more than one person since some of those people might very much not be interested. An email with a subject like "Does anyone have a set of jump leads" is more useful than one with a subject of "request" and a body that contains the question.

Consider alternatives. Sometimes you could write your own application to do something that you currently use email for (such as monitoring backups, lift sharing or general enquiries). You could use a separate system (like Yammer) to allow people to ask non-work related questions to keep your inbox down. Also, an email like "Potatoes are ready" being sent every day at lunchtime can be annoying. How about telling people that the potatoes will always be ready at 12 unless there is an email to the contrary?

I heard about the head of a car company who had banned his employees from using email because it is really inefficient. It will be interesting to see whether they can break the email mindset!

Thursday, 22 December 2011

Engineers are often (rightly) accused of being pedantic. We insist that we use an integer rather than a short or that we should or should not use a pattern. The sales type people don't understand this and will tell us that as long as the customer sees a system that does what they want, they couldn't give two hoots (or 2 dollars) for how it is implemented. This might seem like a pragmatic approach since in a sense that is true but sadly there are a few assumptions in that statement that need to be exposed and which reinforce the need for strict coding practices.
Firstly, the life of software is usually much longer than is assumed. Look at Windows XP for instance. When it was first released in 2001, MS would have had a plan that in 5 years it would be pretty much replaced by whatever their next OS was going to be (Vista or Windows 7 perhaps). In other words, don't get crazy with it: if a bug will survive 5 years, we can leave it in and then fix it later. In fact, XP still holds around 33% of internet connected machines 10 years later. It is still being used, and the bugs that might have lain undiscovered for 5 years are much more likely to surface over 10 years. In a similar way, do we really design software for 10 or 20 years? If we designed a car to survive that long, we would have to be very careful about all components, yet in software we can still be quite glib about the level of quality that is acceptable.
Secondly, it is quite glib to say something like, "as long as the customer gets what he wants", both because over time this will prove less true and also because it is not as black-and-white as that. MS Word is pretty much OK but there are things I don't like about it and I live with them - it is neither perfect nor useless. The level to which we can achieve this 'perfection' however is important in people trusting our software. Also, this acceptance is based on functionality as much as reliability, and there is every chance after a while that a customer will want to change something, which is a good test of how well the software was originally written.
Another reason why we should strive for quality is to ease maintenance. We have staff turnover and we have to change and fix things. There is nothing worse than being presented with an obvious defect but being totally overwhelmed by the code we have to look through to find the cause of the defect. This is why we should keep our objects classified specifically and deliberately. Data objects contain properties to describe their data and little else. Control objects provide methods to do stuff and should not generally be persisted. Interface objects translate between specific systems and can be added or removed as the boundaries change.
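As a rough illustration of the three kinds of object described above, here is a minimal sketch. All of the names (`Patient`, `IPatientStore`, `PatientRegistrar`, `InMemoryPatientStore`) are made up for the example and do not come from any real system.

```csharp
using System;
using System.Collections.Generic;

// Data object: properties describing the data and little else.
public class Patient
{
    public Guid Id { get; set; }
    public string Name { get; set; }
}

// Interface object: translates between our system and one external system.
// It can be swapped out when the boundary changes.
public interface IPatientStore
{
    void Save(Patient patient);
}

// One possible interface object, here just an in-memory list.
public class InMemoryPatientStore : IPatientStore
{
    public readonly List<Patient> Saved = new List<Patient>();

    public void Save(Patient patient)
    {
        Saved.Add(patient);
    }
}

// Control object: provides the behaviour and is not itself persisted.
public class PatientRegistrar
{
    private readonly IPatientStore store;

    public PatientRegistrar(IPatientStore store)
    {
        this.store = store;
    }

    public Patient Register(string name)
    {
        if (string.IsNullOrEmpty(name))
            throw new ArgumentException("name is required", "name");

        var patient = new Patient { Id = Guid.NewGuid(), Name = name };
        store.Save(patient);
        return patient;
    }
}
```

Someone chasing a defect in registration only has to read `PatientRegistrar`; someone chasing a storage problem only has to read the store.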
We have probably all been in the situation where we notice an issue in code and are told it is so low a priority that we should not fix it (partly this is because of the build time overhead - something we should consider) but as with rust, what starts with a small bit can grow and grow until a full rewrite or re-factor is required. We must be careful that we don't break code that might be 'wrong' but works because something else takes account of its wrongness but at the same time, we should feel at ease making code more readable and obvious. Of course, the best way to do this is to get the code correct in the first place so it doesn't need to be changed and this again requires company buy-in because it involves training and reviews. It requires keen eyes that can spot where mistakes are creeping in and finding ways to mitigate the risks. Even subtle coding styles can introduce errors so sometimes we might require very high-level training about separation of concerns, correct use of comments and ensuring that we are using tools that make this easy to do so the time required doesn't become unacceptable.
We need to treat our software with more respect, and we need processes and people who understand that, so that what we produce is still running as expected in 10 or 20 years' time, hopefully with a lot of repeat work!

Monday, 19 December 2011

I think I probably share the opinion of many others that protecting your own ideas is fair enough (although I could be convinced that people should help each other). Let us say, for instance, that someone spends millions developing a working fusion reactor: it would be unfair if everyone just copied that idea and capitalised on someone else's investment. Sadly, most software patents involve, in my opinion, virtually no 'intellectual property' and are basically money-making schemes that make lawyers rich and everyone else poorer. Let us start with something like MP3. Presumably this took time and effort to develop and is a protected technology which, although you might disagree with the marketing model, few people would argue with. Let us, however, look at some other patents like 'pinch to resize' and 'using clever routing to improve connection speeds'. Firstly, these require no intellectual prowess to conceive. In the same vein, I could conceive of flying cars and teleporters, but without developing them I should not have the right to the intellectual property just because I can run fastest to the patent office. More significantly, however, the idea itself does not provide a competitor with much. They still have to code it or implement it, and this surely is where the cleverness comes in.
It seems that most of these patent cases appear in the USA, the most materialistic and capitalistic country in the world, a country that panders to the dollar more than to the community or to progress in general. Even Google, who are famously open and generous with their offerings, seem to get stuck in this area since the old-fashioned companies attack them for sharing this amazing knowledge with the world at large rather than selling it for the money that only marketers dream of.
Shame on you MS, BT, Apple, Google etc. Get over yourselves and start being proactive rather than reactive. Make stuff that people want to buy and don't pretend that you are unfairly losing market because someone made something that was the same colour as yours!

Monday, 12 December 2011

I need suggestions for a new word. In software engineering, we get taught about things that are important like code clarity and maintenance, but when we start work, we soon realise that only functionality that can be seen by the customer has any real worth. Things that have been done messily are consigned to, "yeah, we'll fix that later" and we all know we won't. We all know that companies do not have capacity for people to tidy up (and potentially break) existing code.
It's a bit like functionality is the bourgeoisie and clarity and neatness are simply the illegitimate child, or maybe the working class, of software engineering. Yeah, we care, just not that much. The sad fact is that this needs to be done correctly at the start or it never will be, which is sad because it causes so much pain and so many quality issues further down the line.
What word for this sadly neglected area of software engineering? For this butler/chauffeur of the software community? For this important but under-valued foundation?
I reckon something like UnderCode?

Friday, 9 December 2011

Spot it? I didn't! The obvious intention is to create a date which is the 1st of the next month. In general it works as expected, but when the month is December, month + 1 = 13, which will understandably cause the constructor to throw an exception. The compiler won't catch this because the types are correct.
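The failure described above can be reproduced with a sketch like the following. The class and method names are made up, and the body is only an assumed shape based on the description (add one to the month number and pass three integers to the `DateTime` constructor).

```csharp
using System;

public static class NextMonthBug
{
    // Assumed shape of the problem code: builds "the 1st of next month"
    // by adding one to the month number.
    public static DateTime Naive(DateTime today)
    {
        // Compiles happily, but throws ArgumentOutOfRangeException
        // when today.Month is 12 because 13 is not a valid month.
        return new DateTime(today.Year, today.Month + 1, 1);
    }
}
```

Calling `NextMonthBug.Naive` with any date from January to November works; any date in December throws.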
What point am I making? Well, quality is about learning from mistakes. What can we learn from the above code? Obviously we could simply attribute it to human error and say a code review might pick it up (but possibly/probably not) but what else can we decide? Well, one thing about strongly typed languages is that we should aim to get the compiler to test as much as possible on our behalf to catch these sorts of errors so the real issue is that there should NOT be a DateTime constructor that takes 3 integers for arguments. It sounds reasonable on paper but it means that for 99% of combinations of parameters, the constructor will throw an exception. What we should have instead is a constructor that takes enumerated values and then we could make use of the AddMonths method (which we could have used in the above code) to get something more like this:
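A minimal sketch of the idea, assuming a hypothetical `Month` enum and a `SafeDate.FirstOf` factory standing in for the constructor that `System.DateTime` does not actually have:

```csharp
using System;

// Hypothetical enum; System.DateTime has no overload that takes one.
public enum Month
{
    January = 1, February, March, April, May, June,
    July, August, September, October, November, December
}

public static class SafeDate
{
    // Stand-in for the enum-taking constructor wished for above.
    public static DateTime FirstOf(int year, Month month)
    {
        if (!Enum.IsDefined(typeof(Month), month))
            throw new ArgumentOutOfRangeException("month");
        return new DateTime(year, (int)month, 1);
    }

    // AddMonths rolls December over into January of the following year.
    public static DateTime FirstOfNextMonth(DateTime today)
    {
        return FirstOf(today.Year, (Month)today.Month).AddMonths(1);
    }
}
```

For example, `SafeDate.FirstOf(2011, Month.December).AddMonths(1)` gives 1 January 2012 with no month arithmetic in the calling code.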

Which would give a much more solid representation of what we are doing and would make it harder to fudge. That way, we would have to cast a number to an enum if we wanted to hack the code and this would be much more obvious in a code review:
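To show what that cast would look like, here is a small sketch. The `Month` enum and the `FirstOf` factory are hypothetical (they are not part of .NET) and are repeated here only so the example stands on its own.

```csharp
using System;

// Hypothetical enum, not part of .NET.
public enum Month
{
    January = 1, February, March, April, May, June,
    July, August, September, October, November, December
}

public static class CastHack
{
    // Hypothetical enum-taking factory.
    public static DateTime FirstOf(int year, Month month)
    {
        if (!Enum.IsDefined(typeof(Month), month))
            throw new ArgumentOutOfRangeException("month");
        return new DateTime(year, (int)month, 1);
    }

    public static DateTime Hacked(DateTime today)
    {
        // The explicit (Month) cast is needed to get the arithmetic past
        // the compiler, so it stands out in a review. The factory still
        // rejects the value 13 at run time.
        return FirstOf(today.Year, (Month)(today.Month + 1));
    }
}
```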

Thursday, 8 December 2011

When I first started using mocks, I must admit finding them slightly confusing. I knew what they were supposed to do but some of the language was a bit confusing so then I didn't know if I actually understood them at all. I wanted to write a very basic introduction to Mocks to get you in the mood for using them.
Firstly, a mock object (and I will be referring to Rhino Mocks) is designed to act in place of an actual piece of software, whether a service or even one of your own sub-systems. This is good for two reasons. One is that you might not have access to the real-life object (perhaps a web service that you cannot call from your test environment). The other is that mocks allow you to test only a part of your system without risking the introduction of defects from other parts of the system (which might be outside of your control or not finished yet). For example, suppose you are writing a user interface to a banking system and you want to test your user interface. If you connect it to the actual back-end, you might see loads of errors that are nothing to do with your code and which make it hard to know whether your code is correct or not. By "mocking" the back-end, you can tell the mock how to behave and therefore test whether your code does what it says without worrying about someone else's defects.
To create a mock object you implement whatever interface you want to mock and then it is usual to encapsulate a generic mock property like so:
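A minimal sketch of such a wrapper, assuming Rhino Mocks 3.5 or later. The interface is a placeholder for whatever you want to mock, and `MyMock` is a made-up class name.

```csharp
using System;
using Rhino.Mocks;

// Placeholder for the interface you actually want to mock.
public interface IInterfaceIWantToMock
{
    void SomeMethod();
}

public class MyMock : IInterfaceIWantToMock
{
    // The generated Rhino mock that records expectations and calls.
    private readonly IInterfaceIWantToMock mock =
        MockRepository.GenerateMock<IInterfaceIWantToMock>();

    // Exposed so that tests can set up or verify expectations directly.
    public IInterfaceIWantToMock Mock
    {
        get { return mock; }
    }

    // Each interface member simply forwards to the generated mock.
    void IInterfaceIWantToMock.SomeMethod()
    {
        mock.SomeMethod();
    }
}
```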

If your mock is for a service, you should also implement IDisposable and set up your endpoint replacement for web config files. The mock could use any of the service bindings; in this case I use a named pipe:
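A rough sketch of what this could look like, assuming the service is a WCF service. The pipe address and the class name `MyServiceMock` are made up; the address has to match whatever client endpoint your test configuration points at.

```csharp
using System;
using System.ServiceModel;

[ServiceContract]
public interface IInterfaceIWantToMock
{
    [OperationContract]
    void SomeMethod();
}

// Single instance mode lets ServiceHost wrap an object we created ourselves.
[ServiceBehavior(InstanceContextMode = InstanceContextMode.Single)]
public class MyServiceMock : IInterfaceIWantToMock, IDisposable
{
    // Assumed address; it must match the client endpoint in the test config.
    private const string Address = "net.pipe://localhost/MyServiceMock";
    private readonly ServiceHost host;

    public MyServiceMock()
    {
        // Host this very instance on a named pipe endpoint.
        host = new ServiceHost(this);
        host.AddServiceEndpoint(typeof(IInterfaceIWantToMock),
                                new NetNamedPipeBinding(), Address);
        host.Open();
    }

    void IInterfaceIWantToMock.SomeMethod()
    {
        // Forward to the generated Rhino mock here.
    }

    // Closing the host frees the pipe for the next test.
    public void Dispose()
    {
        host.Close();
    }
}
```

Wrapping the mock in a `using` block in a test makes sure the pipe is released even if the test fails.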

You would then need to implement any members of your interface, which in most cases will simply forward the call onto the mock object:

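// Explicit interface implementations cannot carry an access modifier.
void IInterfaceIWantToMock.SomeMethod()
{
    // Forward the call to the generated mock.
    mock.SomeMethod();
}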

Now the bit that can seem quite tricky, the "expectations". The reason it is tricky is because although it is called an expectation, you do not have to "require" the call to be made and you can either be very specific about expectations or very general. It is better to be specific but in some cases this might mean lots of extra code. You can also specify things like how many times to expect the call to be made and what to return when the method is called in this way. It is probably neater to specify expectations inside the Mock class itself but you could expose the mock in a property and set them up elsewhere (just a choice of how tidy you want it). Let us specify a very simple expectation on our mock:
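A minimal sketch of a simple expectation, assuming Rhino Mocks 3.5 or later. The wrapper class and the method name `ExpectSomeMethod` are made up for the example.

```csharp
using Rhino.Mocks;

public interface IInterfaceIWantToMock
{
    void SomeMethod();
}

public class MyMock : IInterfaceIWantToMock
{
    private readonly IInterfaceIWantToMock mock =
        MockRepository.GenerateMock<IInterfaceIWantToMock>();

    // Tell the mock to expect exactly one call to SomeMethod().
    public void ExpectSomeMethod()
    {
        mock.Expect(m => m.SomeMethod()).Repeat.Once();
    }

    // Fails if any expectation set up above was not met.
    public void VerifyAllExpectations()
    {
        mock.VerifyAllExpectations();
    }

    void IInterfaceIWantToMock.SomeMethod()
    {
        mock.SomeMethod();
    }
}
```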

When this method is called, it tells the mock that if someone calls the mock method SomeMethod, you should allow it once but if it was called a second time, the mock would throw a null reference exception. You can specify a specific number of times or even Any() but don't be lazy with Any(), hopefully you will know how many times the method should be called. We will now look at a method which takes an argument and returns something to see what these look like. Bear in mind that lambda expressions are used extensively to make the syntax neater.
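A sketch of an expectation with an argument and a return value, assuming Rhino Mocks 3.5 or later. The request and response classes are hypothetical; only the names `SomeOtherMethod`, `MethodRequest`, the id and the success flag come from the text.

```csharp
using System;
using Rhino.Mocks;

// Hypothetical request and response types for the example.
public class MethodRequest { public Guid Id { get; set; } }
public class MethodResponse { public bool Success { get; set; } }

public interface IInterfaceIWantToMock
{
    MethodResponse SomeOtherMethod(MethodRequest req);
}

public class MyMock : IInterfaceIWantToMock
{
    private readonly IInterfaceIWantToMock mock =
        MockRepository.GenerateMock<IInterfaceIWantToMock>();

    // Expect two calls whose request carries this id, and say what
    // the mock should return for them.
    public void ExpectSomeOtherMethod(Guid id, bool success)
    {
        mock.Expect(m => m.SomeOtherMethod(
                Arg<MethodRequest>.Matches(r => r.Id == id)))
            .Return(new MethodResponse { Success = success })
            .Repeat.Twice();
    }

    // This is the method the code under test actually calls.
    MethodResponse IInterfaceIWantToMock.SomeOtherMethod(MethodRequest req)
    {
        return mock.SomeOtherMethod(req);
    }
}
```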

If we dissect this, we need to remember that this method is NOT what is going to be called during our testing; that would be the method called SomeOtherMethod(MethodRequest req). This is merely saying that when we do call that method with the id given in the call to the expect method, it should return a response which includes a flag called success. This time we have said this will be called twice. We have specified that the arguments to the mock are important by matching id with the subsequent call to SomeOtherMethod() (we could set up the expectation to not care about some or all of the arguments but again that is in danger of being lazy). One of the things that this implies is that we could call this expectation method more than once with different Guids and the mock would then expect SomeOtherMethod() to be called twice for each different Guid passed in. This allows us to be very specific about what we expect and what we don't expect. Suppose for instance we are dealing with 2 different Guids in a test and we want to make sure the correct one is passed to a certain mock method. We therefore ensure that the call to the mock matches the parameter to one we told it to expect (like we did above in ExpectSomeOtherMethod()).
There are of course two sides to this story, one is that we want to tell the mock what we expect to pass to the mock but also we must specify what we would then expect the mock to return. This can cause us a risk because we might have to assume what the real-life object will return and we might get it wrong. Therefore, ideally, we would have input from the people who are writing the real object to ensure our mock expectations (specifically what they return for our arguments) are correct.
The consumer of the mock then has to create the mock object, set up expectations, optionally verify that all expectations were met and then dispose of the mock if required. If using something like NUnit, it is easiest to set up, reset and dispose of the mocks in the [SetUp], [TearDown], [TestFixtureSetUp] and [TestFixtureTearDown] methods to ensure they are always reset correctly (otherwise you might have an expectation hanging over from a previous test). An example of doing it all in a single method is shown below:
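A sketch of such a test, assuming NUnit and Rhino Mocks 3.5 or later. To keep it short it uses the generated mock directly rather than a wrapper class, and `Consumer` is a made-up stand-in for the code under test.

```csharp
using System;
using NUnit.Framework;
using Rhino.Mocks;

public class MethodRequest { public Guid Id { get; set; } }
public class MethodResponse { public bool Success { get; set; } }

public interface IInterfaceIWantToMock
{
    void SomeMethod();
    MethodResponse SomeOtherMethod(MethodRequest req);
}

// Hypothetical code under test: one SomeMethod() call followed by two
// SomeOtherMethod() calls with the same id.
public class Consumer
{
    private readonly IInterfaceIWantToMock service;

    public Consumer(IInterfaceIWantToMock service)
    {
        this.service = service;
    }

    public bool DoSomethingWhichWillCallOntoTheMock(Guid id)
    {
        service.SomeMethod();
        var first = service.SomeOtherMethod(new MethodRequest { Id = id });
        var second = service.SomeOtherMethod(new MethodRequest { Id = id });
        return first.Success && second.Success;
    }
}

[TestFixture]
public class ConsumerTests
{
    [Test]
    public void CallsServiceAsExpected()
    {
        var id = Guid.NewGuid();
        var mock = MockRepository.GenerateMock<IInterfaceIWantToMock>();

        // Set up the expectations.
        mock.Expect(m => m.SomeMethod()).Repeat.Once();
        mock.Expect(m => m.SomeOtherMethod(
                Arg<MethodRequest>.Matches(r => r.Id == id)))
            .Return(new MethodResponse { Success = true })
            .Repeat.Twice();

        // Exercise the code under test.
        var result = new Consumer(mock).DoSomethingWhichWillCallOntoTheMock(id);
        Assert.IsTrue(result);

        // Optional: fail if any expectation was not met.
        mock.VerifyAllExpectations();
    }
}
```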

The assumption in this test is that DoSomethingWhichWillCallOntoTheMock() will do whatever it does including calling onto our mocked service. From our examples, it will call SomeMethod() once and SomeOtherMethod() twice with the given id. The call to VerifyAllExpectations is optional and when called will ensure that every single expectation was received. This would fail if you only called SomeOtherMethod() once for example or if you called it with the wrong arguments and it therefore hit a different expectation.
Hopefully you get the idea now...

Tuesday, 6 December 2011

Exceptions are a strange beast. They are actually worded correctly and are supposed to relate to things that shouldn't happen such as malformed values being parsed and calling services with incorrect data. However, the line has been blurred between what is genuinely exceptional and what is merely incorrect. For example, if you have a service that is called by a web application and a certain method on that service takes a string which represents an integer, it would be reasonable to throw an exception if the string is null or not parsable as an integer. It would be part of the contract for that method (and could be enforced other ways as it happens) but represents a genuine "exception", a problem in application logic or design. Consider a similar example where a user inputs an integer into a text field on a web app and submits it to the code behind. In this case, an exception should NOT be used since this is not exceptional at all but completely expected (people type things in wrongly all the time).
Why not use exceptions in this case? Well, they seem pretty useful so I guess we could, but there are two reasons why we shouldn't. Firstly, it confuses our minds as to the correct use of exceptions: we start blurring the line between logic/business/coding errors and simple error handling. The second reason is performance. When an exception is thrown, the runtime has to capture the call stack and unwind it to find a handler, which takes a significant amount of resource, and it pays that cost again for every exception thrown. Why? Because exceptions are designed for exceptional conditions and not for normal logic flow. As an example, I ran a test app to compare int.Parse with int.TryParse while parsing a bad string. I caught the exception from Parse locally and ignored it. It took 4.24 seconds to run 100,000 exceptions and only 2 ms to run 100,000 Tries (roughly 2000 times slower to allow exceptions to be part of normal logic).
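A comparison like this can be sketched with a `Stopwatch`. This is not the original test app, just a minimal version under the same assumptions (a bad string, the exception caught and ignored); the class name is made up and the timings will vary by machine and by whether a debugger is attached.

```csharp
using System;
using System.Diagnostics;

public static class ParseTiming
{
    // Time a number of failed int.Parse calls, swallowing each exception.
    public static TimeSpan TimeParse(string input, int iterations)
    {
        var watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            try
            {
                int.Parse(input);
            }
            catch (FormatException)
            {
                // Ignored on purpose: we only want the cost of the throw.
            }
        }
        watch.Stop();
        return watch.Elapsed;
    }

    // Time the same number of int.TryParse calls, which report failure
    // through the return value instead of throwing.
    public static TimeSpan TimeTryParse(string input, int iterations)
    {
        int value;
        var watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            int.TryParse(input, out value);
        }
        watch.Stop();
        return watch.Elapsed;
    }
}
```

Calling both with something like `("not a number", 100000)` and printing the two elapsed times gives the comparison.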
That last line gives a clue to how we might avoid exceptions. We can use alternative methods and operations. It should only be at boundaries that we need to check formats (the user interface and interfaces with other systems); anything internal to our application should behave. If we are testing text fields for correctness, then we can either use the TryParse family of methods or use a Regex to specify exactly what we want something to look like. That way we can call IsMatch as a boolean check rather than allow a parsing operation to throw. We can similarly check other fields and types using simple if ( ... ) return false or if ( ... ) DisplayError(), bearing in mind that our checks are only useful if we can inform the user or a system admin that something has been entered incorrectly.
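A small sketch of both approaches. The class name and the "one to four digits" rule are made up for the example; use whatever pattern describes your own field.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InputChecks
{
    // Assumed rule for illustration: one to four ASCII digits, nothing else.
    // \z anchors at the very end so a trailing newline is rejected too.
    private static readonly Regex QuantityPattern = new Regex(@"^[0-9]{1,4}\z");

    // Boolean format check; nothing is thrown for bad input.
    public static bool LooksLikeQuantity(string text)
    {
        return text != null && QuantityPattern.IsMatch(text);
    }

    // TryParse reports failure through its return value instead of throwing.
    public static bool TryReadQuantity(string text, out int quantity)
    {
        return int.TryParse(text, out quantity);
    }
}
```

The calling code can then do `if (!InputChecks.LooksLikeQuantity(text)) { /* show an error */ }` without a try/catch in sight.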
As an acid test, you should generally never see an exception caught and ignored. Catching, logging and then continuing might be valid (as long as you have a way of fixing the underlying problem) but if you ignore the exception, you have made it unexceptional.

Tuesday, 29 November 2011

Another of Microsoft's suitably abstract errors which don't really tell you what is going on. They have gone to the effort of catching a problem so why not describe it better?
Anyway, I got this while calling onto a Rhino mock class which I had supposedly set up an expectation for. It was unusual since my understanding was that if the expectation matched, I would get a response, and if it didn't then I would get null back and it would fall over elsewhere. There is obviously a middle ground.

A fairly simple use of the expectation. The problem appeared to be the fact that one of the request properties was a class, not just a number or string etc. and in the code I was using, effectively, I was using this for the expectation:

req = new DetectRequest { Id = Id, Stage = new ClassA() };

but calling this on the mock:

req = new DetectRequest { Id = Id, Stage = new ClassB() };

The different class was intentional and I would have expected that if these didn't match, AllPropertiesMatch() would fail and therefore the mock would return null. In fact, it generates the above error, "object does not match target type". All I had to do was to set up the expectation with the same class I was passing into the actual call and it was fine.
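One plausible explanation, which is an assumption on my part rather than something confirmed from the Rhino Mocks source, is that the property comparison reads the expected object's properties by reflection and applies them to the actual object. Plain .NET reflection gives exactly this error when a property of one class is read from an instance of another, as this sketch shows (the `Name` property and the helper class are made up):

```csharp
using System;
using System.Reflection;

public class ClassA { public string Name { get; set; } }
public class ClassB { public string Name { get; set; } }

public static class TargetTypeDemo
{
    // Reads ClassA.Name via reflection from whatever object is passed in.
    public static object ReadNameAsClassA(object target)
    {
        PropertyInfo property = typeof(ClassA).GetProperty("Name");

        // Throws TargetException ("Object does not match target type.")
        // when target is not a ClassA, even if it has its own Name property.
        return property.GetValue(target, null);
    }
}
```

Passing a `ClassA` works; passing a `ClassB` throws, even though both classes look identical.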

Wednesday, 23 November 2011

The problem with any reasonably secure activation system is a) it will cause a lot of problems for certain people/hardware and b) it will probably be hackable anyway. I had just endured the pain of losing my Linux files due to the terrible way in which Windows installations simply overwrite the master boot record, ignoring any existing systems. In my case it actually overwrote it with a new type of boot record (GPT) which was incompatible with 32-bit Windows XP, which then couldn't install. I thought my pain was over: all I needed to do was boot XP, install any missing drivers and be done with it. Oh no.
Activation. I have a legit copy of XP (although I had lost the key) so I expected to at least be able to get 30 days to find my key so I could activate it. In order to get past the installation, I used some key I found on the internet (expecting it to be blacklisted for activation).
I booted up to the login screen and was immediately told I had to activate before I could log in and surely enough, my only options were to activate or it would log me back off. This didn't seem right, I had never come across this before.
Surprise, surprise, there is a bug in the way activation works. The simple cause was that I had no network installed because neither of my drivers (wired and wireless) were installed by XP. Firstly, this causes activation to get confused and think you have done something bad which requires immediate activation (which is understandable if it worked) but the major problem is then you cannot activate over the network (no big deal) but if you click "telephone", it displays a page with your hardware id which is - BLANK. Why? Because part of this key comes from the network card which is not installed yet. I called Microsoft but they were mostly useless and told me to ring tech support.
What I had to do was reboot into safe mode (press F8 when starting up) but NOT safe mode with networking, that won't work. In safe mode, I had to get the network drivers (via CD because the network wasn't working) and attempt to install them along with the video drivers. The wi-fi installer wouldn't work in safe mode but the lan looked like it did. I also tried this hack to disable the activation check by changing a registry key.
Rebooted into normal and it still said I needed to activate but when I pressed "Activate Now", it told me I was already activated (presumably because of what I changed in the registry) so I changed it back. Rebooted again and this time it told me to activate and brought up the screen to activate. This time, the lan was enabled so I brought up the internet activation window and it told me my key was invalid (as I expected). Since I couldn't find my key, I found a key gen on t'internet and it generated me a key that worked! I then had the joy of installing service pack 1 and then 3 although after 3 was installed, the screen started working properly and the whole thing seemed better. I then obtained and installed the remainder of the drivers from the Samsung site (they are hard to find!) so that there were no more little yellow icons in device manager.
Now all I have to do is reinstall Ubuntu which will take MUCH less time (and no activation).

I have spent about 10 hours of time trying to install Windows XP as a dual boot on my Linux laptop and all the problems I had and the fact I lost my Linux partitions are all related to Microsoft's selfish installer that overwrites the master boot record rather than installing alongside whatever is already there but it is worse than that.
I have two Windows XP CDs, a full 64 bit one and an upgrade 32 bit one, and the reason I am installing it, although I am more than happy with Ubuntu, is that, as you might expect, there is a piece of software I want to use that only runs on Windows and Mac.
Simple, I thought, I've done it before. You partition your disk from the Linux boot CD using GParted and then run the Windows installer, which overwrites the MBR. You then re-run the grub install from the Linux boot CD and you're back in business. Except I wasn't.
The problem I had related to something I didn't know existed: the GUID Partition Table (GPT), a more modern replacement for the master boot record partitioning scheme that doesn't need a limit on primary partitions or the funny use of extended and logical partitions - fair enough. What I didn't realise was that Windows XP 64 bit (and presumably other newer versions of Windows) installs this without telling you that it is totally incompatible with older systems like Windows XP 32. This wouldn't even have been noticed, since grub can handle GPT, except for one problem. I installed Windows XP 64 bit and most of the drivers for my hardware weren't there (or even on the net, they were probably never written). So I had got a working dual boot system with XP 64 but one which was mostly useless.
No problem, I thought, I'll just run the XP 32 bit installation and choose the XP 64 bit partition as the destination and it can overwrite everything. Right? Nope. Run up the XP installation CD and all it can see is 131GB of an unknown partition type (it's a 160GB disk but more on that later). I was confused. What was happening? Ah. Maybe I need to re-run grub to reset the MBR before the installation will read it correctly. Nope. Still the same. I couldn't work out what was happening, even from Google, but the helpful people at ubuntuforums suggested the problem was related to my new GPT boot record which XP 32 didn't understand. They were right, so now I was a bit stuffed: how could I overwrite the GPT with an MBR so that XP 32 would understand?
This is where I cocked up a bit. I ran up GParted again and found out there was an option to re-write the boot record (with the MBR format) which I chose. It warned about data loss but I presumed it meant there was a risk. What actually happens is that it wipes all the current partition information and then writes the MBR. I tried disktool to get them back but somewhere, they must have experienced a quick format and lost the data. I wasn't too bothered. I have a backup of most of my stuff and use Google quite heavily, at least.
So I then spent the next few hours trying various combinations of GParted and diskpart (the Windows equivalent) as well as fdisk to try and get XP 32 to read the 160GB disk. It was as if nothing made any difference. Was I actually resetting the MBR at all?
This was when I learned something else. Various combinations of HDD, 32 bit OS and BIOS settings mean that the installation would only be able to 'see' the first 131GB of the disk. What I ended up doing was what is suggested for a clean dual boot install: do Windows XP first. I removed all partitions, created a 50GB partition for Windows to use and then ran the installation. Did the pain end there? Nope, see my next post.
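
The post doesn't say which limit was in play. One well-known candidate is 28-bit LBA addressing with 512-byte sectors, and the arithmetic below (an assumption about the cause, not something confirmed in this post) gives a ceiling that matches the size the installer reported when expressed in megabytes.

```python
SECTOR_BYTES = 512   # classic sector size (assumed for this disk)
ADDRESS_BITS = 28    # 28-bit LBA (assumed to be the limit involved)

# Largest addressable capacity: number of sectors times bytes per sector.
max_bytes = (2 ** ADDRESS_BITS) * SECTOR_BYTES
max_mib = max_bytes // (1024 ** 2)
max_gib = max_bytes // (1024 ** 3)

print(max_bytes, max_mib, max_gib)
```

That is 131,072 binary megabytes, or 128 binary gigabytes, which would explain an installer showing roughly 131 thousand MB on a larger disk.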

Tuesday, 15 November 2011

Perhaps it is Internationalization? We live in a global market now with a web site in Nepal appearing in the Google results of Brazil and one of the knock-on effects of this is to make sure your web sites are foreign speaker friendly.
Unfortunately, this whole area is very complex because languages are complex, alphabets vary and you might have a Chinese person living in an English speaking country - all of this adds up to a headache which most people simply avoid. It needn't kill your brain, however. I thought I would guide you through some of the pitfalls and give you some pointers on how to make your web sites work for more than just English speakers.

Use a pre-built application framework where possible, one that already includes multi-locale support. It is much easier if you use something pre-built even if it is slightly different from how you might have designed it.

Learn what the words mean. A locale is a set of conventions for formatting things like dates, times, numbers and currency. It is generally related to a single country or group of neighbouring countries and to a single language. For instance, you can get a locale for Spanish (Spain) and a different locale for Spanish (Mexico).

A language is a complex idea since some languages are really just dialects of each other. For the most part we are concerned about languages that have different written forms rather than spoken forms since our websites are predominantly written down.

For practical reasons, if we make our site multi-lingual, we should restrict the languages we support to a minimum. For some of us, we might be happy with English, Spanish, Portuguese, Russian and Chinese as covering most of the globe. For others we might have Mandarin Chinese and Yue Chinese (Cantonese). Of course, if we are a Chinese site, we might have 12 Chinese languages. We would probably also decide that Scots English is close enough to English English to not need a separate translation (although again in some specialist areas this might be required).

It is easier to de-normalise databases for language work and make each row a single language selection. This row might have a country name, locale, language name (possibly translated into the other language e.g. Deutsch instead of German). Some of these values might be duplicated but it makes for easier administration.
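
As a rough sketch of what such a de-normalised table could look like, here is an in-memory SQLite example in Python. The table name, column names and sample rows are all made up for illustration; the point is only that each row is one selectable language and that values like "Spanish" are repeated rather than split into separate tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE site_language (
        locale_code   TEXT PRIMARY KEY,  -- e.g. 'es-MX'
        country_name  TEXT NOT NULL,
        language_name TEXT NOT NULL,     -- name in English
        native_name   TEXT NOT NULL      -- name in the language itself
    )
    """
)
rows = [
    ("en-GB", "United Kingdom", "English", "English"),
    ("de-DE", "Germany", "German", "Deutsch"),
    ("es-ES", "Spain", "Spanish", "Español"),
    ("es-MX", "Mexico", "Spanish", "Español"),
]
conn.executemany("INSERT INTO site_language VALUES (?, ?, ?, ?)", rows)


def language_choices(connection):
    """Return (locale_code, native_name) pairs for a language picker."""
    cursor = connection.execute(
        "SELECT locale_code, native_name FROM site_language ORDER BY locale_code"
    )
    return cursor.fetchall()
```

One query with no joins is enough to build the language picker, which is the administrative convenience being traded against the duplication.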

If you want your site to be highly multi-lingual, consider adding a system whereby volunteers can translate the words for you rather than you having to pay people to do it.

Use as few words as possible on your site and use pictures where these are clearer. A picture of an exit door saves you umpteen translations of the word "Exit".

Try and auto-detect the language from the Accept-Language header that the browser sends with each request, for convenience.
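
Browsers normally send their language preferences in the Accept-Language request header, for example "zh-CN,zh;q=0.9,en;q=0.8". The following is a simplified sketch of reading it in Python; the function names are made up, the supported set is assumed to hold lower-case codes, and a real framework will usually do this for you with more care.

```python
def parse_accept_language(header):
    """Return language tags from an Accept-Language header, best first."""
    weighted = []
    for position, part in enumerate(header.split(",")):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0  # a missing q value means full preference
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:  # q=0 means "not acceptable"
            weighted.append((-quality, position, tag.strip().lower()))
    return [tag for _, _, tag in sorted(weighted)]


def pick_language(header, supported, default="en"):
    """Pick the first supported language, also trying the primary subtag."""
    for tag in parse_accept_language(header):
        primary = tag.split("-")[0]
        if tag in supported:
            return tag
        if primary in supported:
            return primary
    return default
```

For instance, pick_language("es-MX,en;q=0.5", {"en", "es"}) picks "es", because the site has Spanish even though it has no Mexican variant.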

It should be easy to get to the language you want. If you have, for instance, logged onto a site in China but are an English speaker, your site might have detected Chinese as the browser language and left you staring at something you cannot possibly understand. By using something like a flag in the top corner of all pages and translating the language names into those languages, it should always be easy to find the language you want.

Try and monitor the translation status of languages on your site and if you feel they are not adequately finished, include a mechanism to disable the language. It would look pretty lame if your site was only 10% Tajiki when switched to that language.

Default all messages to English if there is no suitable translation (don't put blank strings there instead!)
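
A minimal sketch of that fallback in Python, assuming translations are held in a simple dictionary (the keys and strings here are invented for the example):

```python
TRANSLATIONS = {
    "en": {"exit": "Exit", "welcome": "Welcome"},
    "de": {"exit": "Ausgang", "welcome": ""},  # 'welcome' not translated yet
}


def translate(key, language, default_language="en"):
    """Look up a message, falling back to English if missing or blank."""
    text = TRANSLATIONS.get(language, {}).get(key)
    if text:
        return text
    # Last resort is the key itself, so the page never shows an empty label.
    return TRANSLATIONS[default_language].get(key, key)
```

The blank German "welcome" comes back as "Welcome" rather than an empty string, and a language the site has never heard of simply gets the English text.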

Do not re-invent the wheel. Even if you are building your site from scratch (which you hopefully are not!) the data for countries and locales is all out there for free, do not type in the names of 6400 languages.

Friday, 11 November 2011

Lots of people ask this in forums and there are various complex answers that come back - enough to make the most hardened GNU cohorts shiver. Anyway, dead easy, just use grep from the command line.

grep -r 'search term in here' *

-r is to recurse sub directories and the * says look in everything. The result is written to the console but you could always redirect it to a file using "> something" at the end (without the quotes of course).

This is a common error that doesn't seem to make sense. How can you not cast an enum from a type to itself? Makes more sense when we realise that types are often shared by copying libraries between projects. We might create a service contract version 1 and then add an enum entry and create version 2. We update the sender with version 2 but not the receiver and then we get the error listed above.
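
The same kind of mismatch can be reproduced in a few lines of Python. This is only an analogue of the situation, not the .NET error itself, and the enum names and values are invented: the sender has version 2 of the contract, the receiver still has version 1.

```python
from enum import Enum


class StageV1(Enum):   # the contract as the receiver still knows it
    NEW = 1
    APPROVED = 2


class StageV2(Enum):   # the contract after the sender was updated
    NEW = 1
    APPROVED = 2
    REJECTED = 3       # entry added in version 2


def receive(wire_value):
    """Receiver built against version 1 decoding a value off the wire."""
    return StageV1(wire_value)


sent = StageV2.REJECTED.value
try:
    receive(sent)
    outcome = "ok"
except ValueError as error:
    outcome = str(error)
```

Both sides believe they are talking about "the same" enum, but the receiver's copy has no entry for the new value, so decoding fails until the receiver is updated as well.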
Trying to get another piece of code working was taking much longer than it should have and I thought that the change was pretty easy. Unfortunately, the fix I was making was in a shared library which then had to be copied into another project which would run the code (and fail!) and without any easy way to get the debugger to work without project changes. What I realised was that there were various combinations of things I was trying but I wasn't recording them. Use update panel or not? use declarative event handler or code one? Deploy to deployment folder or bin folder? Am I testing the code that I've changed or not?
All I needed to do was actually write down the combinations I had tried and then be careful to point the web site I was testing to the correct place (we effectively have two web sites in the project). Once I did this, I realised that I had missed a combination, one that worked, and that the code was correct, I just kept breaking one thing as soon as I fixed something else.
Be methodical people!

Thursday, 10 November 2011

Classic problem here. Visual Studio web site project that includes various projects and a number of MS and third party dlls that are referenced. A specific project was referencing Microsoft.Practices.Unity version 2.0 whereas when we built the project, we got the error "Thirdpartylibrary.dll references Microsoft.Practices.Unity version 2.0... which is higher than the referenced version 1.2".
Checking into it and the only version we reference is version 2.0 and I couldn't see where the version 1.2 was possibly copied from (into the website bin directory). Eventually I resorted to modifying a Powershell script which listed every dll in our project that was referencing Unity and looked for one which was version 1.2. It turned out to be EnterpriseLibrary.Logging, something we no longer used (and something which wouldn't obviously reference unity). Removed all references to that and it was fine.

Thursday, 3 November 2011

It's annoying when you see an error that lots of people report but you can't find out any useful answers! I had this today when calling a Mock via a named pipe. It was a case of having changed so little and not seeing what might have caused something that looked so serious.
Anyway, after following advice from another post and enabling WCF logging in the app config for the unit tests, the log DOES include the error and it's in red, so nice and easy to find. The problem? I was returning an object from my mock expectation which had about 20 properties, only 5 of which I wanted to consume and had therefore set. It turned out that one of the properties I hadn't set was an enum that defaulted to 0, although 0 was not a valid enum entry, so the serializer was throwing the error.

Monday, 31 October 2011

I was testing a Windows Installer change on a box and found that uninstalling did not remove all the files, even if I manually deleted the folder and installed from clean. In reality there is no such thing as a clean box! I had assumed that there was no reason this service had been installed on the box before, so how could it all have been screwed up? I searched for various GUIDs in the registry and didn't find them until I had the genius idea of searching for the filenames that weren't being uninstalled (some were uninstalled but most weren't). I found them under HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Installer\UserData where the paths are linked to product ids so that files that are shared are not uninstalled if only one of the products is uninstalled. I manually removed all of the entries that pointed to my files (don't bother with Microsoft's Windows Installer cleanup tool - it does bugger all useful stuff), checked that it was all fine and then had to go and look at what had happened.

Firstly, I had made an assumption that this box was clean but actually it had been cloned from another virtual machine and had remnants of a failed install/uninstall of this service in the registry. What I thought was a problem with my changes was a registry problem. Secondly, we need to understand how this can come about. Well, Windows installer adds the stuff correctly to the registry (I assume this is hard to muck up!) but there are two ways in which this can be corrupted. Firstly you can have problems with the uninstall process which leaves stuff in the registry or more commonly, people manually delete program folders leaving orphaned registry keys.

Monday, 24 October 2011

This is an option in Visual Studio database projects under the sqldeployment file. The intention is simple enough. If there is a danger that a database might be accessed by other entities while deploying, you tick this box and after the database is created, it is set to single user mode while the script carries on adding schema objects etc. For some reason, at the end, it is not put back into multi-user mode - which I guess would have to be done manually.
Anyway, the problem I had was that I upgraded a database project that had been built in Visual Studio 2008 to Visual Studio 2010. I did this so I could target the SQL Server 2008 schema (compatibility level 100). There was a problem after deployment in that the database I had deployed (not surprisingly) was in single-user mode, which meant it didn't work with our web application. I was confused because I had not changed anything in the database projects, just rebuilt them in 2010. It was also confusing because only 1 database was affected. The second point was simply answered by the fact that only 1 database had this option ticked, but why had the problem only just surfaced?
When I checked the database.sql that had been deployed from the project, it was only after upgrading to 2010 that the single user mode had been added to the sql. In other words, it appears that a bug in VS2008 was ignoring this tick box and this had been fixed in VS 2010.
Great stuff eh? I can't find any other specific hits in Google for this so maybe no-one noticed (like we didn't).

Friday, 23 September 2011

Those of us who used C and C++ were warned that global variables are not good. We mostly agreed but would sneak things in if we could! C# might not have globals per-se but actually it has static variables which can exhibit several similar problems. Consider the following method in an asp web app:

All pretty boilerplate stuff. The session is used in this case to pass the exception detail into the error page to display if required.
In a live system, this method was crashing but we did not know it was crashing, all we knew was the system was intermittently crawling to a halt. I was looking into this method because the event log messages were NOT being logged even though this was the event handler.
The problem, as you might have guessed, was caused by a 'global'. What was happening was that underneath the call to WriteEvent, a call was made to obtain the 'global' Context to add certain information automatically to the event. The Context logic was broken and in some scenarios it was attempting to access the OperationContext from a service which didn't exist, which threw an exception before the event was written. This in turn caused the app to automatically forward to the error page (this page!) which then tried to write the new exception and caused exactly the same thing to happen, falling into a tight loop. The way it should have been written was:

where it would have been obvious that WriteEvent consumes the context and which would have given us a clue as to what might have not been set. As it was, the 'global' access was hiding this fact (the WriteEvent was in an external library so not obviously available).
In short, don't use globals!
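
The original snippets did not survive in this post, so what follows is not the code in question. It is a small Python sketch of the same idea, with invented names, contrasting a function that quietly reads module-level state with one that takes the context as a parameter.

```python
_current_context = None   # module-level state, the Python cousin of a static


def write_event_hidden(message):
    # The dependency on _current_context is invisible at the call site.
    return "[%s] %s" % (_current_context["user"], message)


def write_event_explicit(context, message):
    # The dependency is part of the signature, so callers must supply it.
    if context is None:
        raise ValueError("write_event_explicit needs a context")
    return "[%s] %s" % (context["user"], message)
```

When the hidden version blows up, nothing in the calling code hints that a context was ever involved. The explicit version makes the requirement visible and fails with a message that says what was missing.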

Friday, 26 August 2011

If you are writing a web application by yourself or in a small team, the temptation is usually to get it finished as soon as possible because even small sites don't take 5 minutes and once it is finished, there is a sense of satisfaction. The problem with this is that many things aren't really considered early on. Sites like Twitter didn't expect so many people to sign up so quickly and if the site wasn't able to scale, people would have got fed up and left or gone to a rival site. We should be thinking about scalability early on in the design and production cycle.
You might decide that due to the nature of the system, it is unlikely ever to exceed a certain size. Perhaps it is an internal company site and would never require more than, say, 1000 users doing simple things. If this is the case, fine. However, any public site runs the risk that it will become popular and an increase in user numbers means an increase in database, memory, network and processor load. If this is the case, you should design in the ability to scale as early as possible, even if your site will not be scaled initially.
One of the realistic possibilities is that you will need multiple database servers and if you need the ability to write to more than one of these at the same time, you will need some way to replicate the data. The difficulty with this relates to the actual design and might not be too bad.
For example, take the most basic replication issue of multiple inserts into the same table. If you have an index column (which you should) and you create user John Smith with ID 1000 in DB1 and then someone else creates user Michael Jones with ID 1000 in DB2, what happens when these databases replicate? In this case, although we might presume that the different names are different people, it would be possible for 2 John Smiths to create an account at the same time (in the gap between replication cycles). In this scenario, the best idea is to use a GUID for the id instead of an integer. Since these are globally unique, you would never have 2 users with ID 1000 on 2 different DBs. This also applies to foreign keys which would also use the GUID and would not be in danger of incorrect linking when copied from one table to its counterpart in the second database.
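
A small Python sketch of the GUID approach, using two dictionaries as stand-ins for DB1 and DB2 (the record layout is made up for the example):

```python
import uuid


def new_user(name):
    """Create a user record keyed by a random GUID instead of a counter."""
    return {"id": str(uuid.uuid4()), "name": name}


db1 = {}
db2 = {}

for database, name in ((db1, "John Smith"), (db2, "Michael Jones")):
    user = new_user(name)
    database[user["id"]] = user

# 'Replication' here is just a union; the keys will not collide in practice.
merged = {**db1, **db2}
```

Neither database needs to know what ids the other has handed out, which is exactly what an auto-incrementing integer cannot offer.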
Updates can be more tricky. If someone sets a piece of data in the row for John Smith in DB1 at the same time as someone changes it in DB2, replication will do 1 of 2 things. If different fields have changed and this can be detected, it will simply take both changes and merge them. If the same field has changed or there is no way in your DB to detect which field has changed, you have the following options:

Flag it to a user who needs to manually decide what to do

Accept the low risk of this happening and simply take the latest change as the correct one

Create a manual system which records the changes so they can be automatically or manually merged

Apply a design which avoids the chance of this happening.
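
To illustrate the field-level merge described above, here is a hedged Python sketch of a three-way merge of a single row. It assumes you still have the last replicated version of the row (called base here) to compare against, which not every database gives you; the function name and row layout are invented.

```python
def merge_rows(base, ours, theirs):
    """Three-way merge of one row; returns (merged_row, conflicting_fields)."""
    merged = dict(base)
    conflicts = []
    for field in base:
        changed_here = ours[field] != base[field]
        changed_there = theirs[field] != base[field]
        if changed_here and changed_there and ours[field] != theirs[field]:
            conflicts.append(field)        # same field edited on both sides
        elif changed_here:
            merged[field] = ours[field]
        elif changed_there:
            merged[field] = theirs[field]
    return merged, conflicts
```

Changes to different fields merge cleanly. Anything left in the conflict list is where one of the options above has to kick in, whether that is asking a user or taking the latest change.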

Really, to design out of these scenarios is always best because it is less maintenance. In our example, suppose we run a site like Facebook. One way we could avoid the problem is only to allow writes on a single database server, the others are read-only. The second option would be to ensure that updating user "John Smith" is only permitted on a specific 'home' database in which case multiple changes are applied sequentially and are replicated to other databases correctly.

Really, the options depend heavily on the type of application you are creating. You might need to perform some load testing on your environment to find out where it is stressed under high load. Don't spend time improving things that are working OK.

Tuesday, 23 August 2011

It's quite fashionable nowadays to talk about DR (disaster recovery) or what happens when something goes wrong. For lots of people though, they can only think about backups and the money to buy a new server if it breaks but DR has to encompass much more than this. It is all very well having backups but do you ever restore them to make sure they work? There is too much at stake to assume it will be alright. What happens after you make an upgrade to your database servers or web servers? Do you restore the backups again to make sure they still work? You might have UPSs on your servers but what happens when the power goes off? Do you assume it will not be off for more than 5 minutes? Do your UPSs cause the servers to shutdown so they are left clean or do you just wait for the UPS to run out and the servers to die? What would happen if your offices burned down? Would you be able to continue business or would all of your important information be lost?
Fortunately, it isn't rocket science, you simply need to perform a fault analysis of every element in your network (not every workstation but a workstation), servers, power supplies, UPSs, backups, internet etc. and then consider what would happen, how you would know and what you would do about it.

I read a great computer science book once (whose name I can't remember!) which talked about having certain 'rules' when we write software and one of them was, "when a defect occurs, it should be obvious where the fault is and it should only need fixing in one place". This rule implies encapsulation and more importantly, well defined and thought out fault handling.
Ignoring the debate of exceptions vs other error handling scenarios, it is obvious that most software still does not have a very well thought out system of fault handling. I have to support a live system and many defects that occur are related to specific scenarios and are extremely difficult to track down. Some of them we assume are environmental because we cannot see any path that would make sense, yet all of these should be obvious. If something encounters an unexpected scenario, it should log that or display an error (or both). I think the underlying problem is that we might review code for neatness and for the main logic being correct but we rarely review code to look for what might go wrong. How often have you asked what errors a service call might throw (does the developer even know?) and how you are going to handle them?
We can save so many hours or even days with a simple error like "X failed because the Y object was not populated" but we need a process to ensure this happens. Leave it to people and they will forget or not bother because of the pressures of implementing functionality.
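
As a rough illustration of that kind of message, here is a Python sketch. The exception class, function and field names are invented; the only point is that the error names both the operation that failed and the thing that was missing.

```python
class CaseError(Exception):
    """Raised when a case cannot be processed; the message says why."""


def submit_case(case):
    applicant = case.get("applicant")
    if applicant is None:
        raise CaseError(
            "submit_case failed because the applicant object was not populated "
            "(case id %s)" % case.get("id", "unknown")
        )
    return "submitted %s" % applicant["name"]
```

Compare that with the attribute or null reference error you would otherwise get three calls deeper, which tells you nothing about which object was missing or why.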
The other big saver is trying to make a method either a list of calls to other methods or some sort of functional work and not both. This allows sequences to be easily reviewed visually and for functional methods to be small and concise and easier to re-factor.

Sunday, 14 August 2011

When we design software systems, we tend to think of perfect real world scenarios where all connections succeed, where services are always available and where the network is always available and running quickly. Sadly, many of us don't really consider what happens if some of these are not true at any given point. Perhaps in our system, it is acceptable to simply allow an error or timeout and for the user to hit refresh (as long as this actually works!) but for some systems, especially ones that require many screens of information to be navigated and saved into the session, this might not be adequate.
The company I work for writes a system to apply for mortgages. Various screens of information are captured and various service calls are made along the way. We had a defect raised the other day which appears to have occurred because something took slightly too long and the page displayed an error. The problem was, the system was then left in an undetermined state. The user interface was happy to try the screens again but the underlying system had completed what it was supposed to do and therefore would not return a 'success' code. There was no way to easily fix the problem, it would have required some direct hacking of a database to attempt to force the case through. In the end, the customer simply re-keyed the whole case but this is not always easy and is not popular if it happens too frequently.
This was a case where we hadn't properly thought through the possible error conditions. In our case, the risk is increased due to the number of people using the web app, so it is not surprising that these things happen. Not only does it not look too good but it takes time and money to investigate the problem even if we end up not being able to fix it. It is also not great as an engineer to be guessing at the cause in the absence of any other discernible problem.
The solution as with all good design is to consider all aspects of the work flow that are outside of our control. This means the server(s) that is hosting the various parts of the system, the network and any third-party calls to services or other apps. We need to consider not only the timeouts, which in our case we do, but also what happens if something times out at roughly the same time the underlying system has completed successfully. This is especially true if you are using multi-threaded applications (which in many cases for web apps you will be). Ideally, your system should be able to go back in time to a known-good state from all pages but certainly those which are higher risk (which call onto third-party services etc) which should cancel and delete all current activity and take you back to a place that you know the system can recover from. This might be slightly annoying to the user but much less so than a full-on crash. You can also mitigate this risk by displaying a user friendly message such as, "the third-party service was unavailable. Please press retry to re-key the details and try again".
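
One complementary technique, which is not what our system did and is a different approach from rolling back to a known-good state, is to make the underlying call safe to repeat. The Python sketch below uses an invented backend class and request id: if the same request arrives again after it has already completed, the stored result is returned instead of a refusal.

```python
class MortgageBackend:
    """Stand-in for the underlying system (hypothetical)."""

    def __init__(self):
        self._completed = {}

    def submit(self, request_id, details):
        # A repeat of a request that already completed returns the stored
        # result instead of refusing, so a retry after a timeout is safe.
        if request_id in self._completed:
            return self._completed[request_id]
        result = {"status": "success", "case": details["case"]}
        self._completed[request_id] = result
        return result
```

With something like this, a page that timed out just as the work finished can simply retry with the same request id and still get its 'success' code.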

Friday, 17 June 2011

Consider the scenario, you have a software project which is dependent on 2 dlls from the same project built elsewhere. Simple scenario but already you have several things that can go wrong:

You might forget to update the dlls when they are changed

You might only update one dll when they are changed

You might deploy the application in a way that means the defect is not noticed until runtime

What can you do? You have several approaches you can take:

Hope it never happens (the most common and least effective measure)

You can write a process which reminds people to update dlls - not bad but people don't always read or understand the implications of processes

You can write a tool that updates things automatically - good idea but can require investment in the development for a simple scenario. Also how do we monitor whether it is working?

Write unit tests that can test for the presence of the correct items - a good idea generally if possible but how to ensure they are kept up-to-date and are testing all the right places?

You can design projects to not require such a dependency - best idea if it is possible without any other knock-on disadvantages or..

A combination of the above dependent on the exact requirements.
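
As an example of the unit test option, here is a hedged Python sketch of a check that two dependent libraries report the same version. How the versions are read from the files is deliberately left out (that part is tool-specific), and the file names and function name are made up.

```python
def check_matching_versions(versions):
    """Raise if the dependent libraries do not all report the same version."""
    distinct = set(versions.values())
    if len(distinct) > 1:
        details = ", ".join(
            "%s=%s" % (name, version) for name, version in sorted(versions.items())
        )
        raise RuntimeError("Dependency versions differ: " + details)
    return distinct.pop() if distinct else None
```

Run as part of the build, a check like this catches the "only updated one dll" case before the application is deployed rather than at runtime.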

As I have said before, anything is better than nothing. There is no single solution because all decisions in life have pros and cons which need to be considered objectively rather than dismissed out of hand. How much damage could occur if the defect is injected? Charge that at £1000/$1500 per day and then ask how much it would cost to develop or create the solution. Even a minor bug can have knock-on effects, both causing other defects and also damage to reputation before you have even thought about fixing the defect itself. The fix is then another potential defect injection since systems are rarely tested to the same degree after a bug fix as they are initially. Basically, even a small defect fix, if you are going to fix it, costs thousands in real terms, money which you are NOT going to get from the customer. Get it right the first time since you will not have time to do it right a second time if you didn't have time to do it right in the first place!

Thursday, 16 June 2011

I have an unusual scenario for MVC with Entity Framework 4. Firstly I have a table with a primary key (not used for foreign keys) and a unique constraint (which is used for foreign keys). Entity Framework does not recognise unique constraints and therefore does not let you use them for foreign keys. The fix is to open the crm.edmx in an xml editor (you might be able to do it from the designer!) and then change the key to point to your unique field and not to your primary key. This allows EF4 to point foreign keys to your field. If you had already created the keys and then re-generated, you might have found that the associations have lost their referential settings which you will need to restore. Obviously if you regenerate, this change might be undone.
The second issue is that although one of my table columns has a theoretical foreign key to another table, the key does not exist, which allows us to test the scenario where an item is orphaned (the foreign key does not exist on the customer system). EF therefore does not provide all the wiring up for a drop-down list on my view to edit this 'foreign' field. The answer is to use a ViewModel, which is simply a class that contains your entity and any other supporting data, in my case a list of SelectItem which is used in the DropDown list. This is populated in the constructor. Then rather than creating the view against the entity, you specify the view model instead and link fields to Model.EntityPropertyName.Name rather than just Model.Name. This also means you can simply link your drop down to the list of SelectItem in the view model which does not exist in the entity itself:

Monday, 6 June 2011

Web security for many people is quite simply one of the most important responsibilities that IT departments have. Even a simple forum site, if cracked, can expose passwords and email addresses which can then be used to access other sites, since many people use the same passwords for all their logins.

It is a simple but important question: who is responsible for application security? Perhaps it's not you because you are not paid? Rubbish. Not you because you are the manager and it is a developer job? Nonsense. Sadly, my experience is that there is a lot of assumption about whose responsibility it is but few people who will stand up and take responsibility when things go wrong. What is worse is that in many ways it is too late when a site has been hacked. Whoever is sorry becomes irrelevant.

Think about some of the risks you have as an individual or company. The integrity of the firewall, the integrity of the network, the applications, the personnel management of people who might cause damage from the inside. Have you even thought about them and done a formal risk assessment? I doubt it. Most people have a very poor and non-methodical approach to security and then try and blame others when it goes wrong. You use an off-the-shelf product but do you keep it up-to-date? You use a lot of networking but are your staff really qualified to keep it secure? Can someone simply plug into your network switches and immediately gain access to your network layer?

So many risks, so little control. Is it any wonder that even high-profile companies suffer from hacking? It is time we started taking this thing seriously. We need auditing, expertise and responsibility. We need people to own up to where their expertise is lacking and the compulsion from management to pay in order to put things right.

I'm not going to hold my breath though. Contact me if you need any consulting on the security risks faced by electronic commerce.

If there is one thing I have noticed about software, it is that people tend to do something well up to a point and then let it all fall down with a "quick hack". In fact, there is no such thing because a quick hack immediately creates technical debt which invariably takes much longer to fix later on, if it is fixed at all. Once you have compromised in one area, it is then hard to try and keep the OO design pure in other areas and before you know it, your code has become a great ball of mud.

For instance, at work we have a service which is used by more than one customer so it has a base part and then a customer-specific part. However, somehow, the base project has ended up with an id of type Guid which is actually customer specific but which goes through the design as if it was customer agnostic. The result? Well, now we are not clear where these Guids should live because another customer simply uses an integer for a customer ID and not a Guid. The design is polluted and other decisions have been made which are not good.

One of the problems is that people are still taught in very structured, low-level ways to think about guids and ints and strings, even though normal people don't understand such concepts. All they understand is a "product ID" or "text".

When we design in OO, we must be merciless with abstraction. Even if our id is a Guid, we should make it an abstract type since the fact it is a guid is irrelevant; it is a unique key and could be abstracted into UniqueKey, which can inherit or contain whatever it needs to. This way, the abstraction holds true when another customer uses text for a unique key because IT IS STILL A UNIQUE KEY!

The thing is, we only learn these things after a mistake is made, which is why it is important to train, to code-review and to retain people who are good at their job by recognising and rewarding them. Otherwise they will think the grass is greener on the other side and jump ship, leaving us with the average people who are not good enough to move jobs or who are not interested enough in work to care!
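To make the UniqueKey idea concrete, here is one possible sketch. GuidKey, IntKey and the CanonicalValue property are names I have invented for illustration; the base project would only ever see UniqueKey, and each customer-specific part would supply its own subclass.

```csharp
using System;
using System.Globalization;

// The base project deals only in UniqueKey; it never learns what is inside.
public abstract class UniqueKey : IEquatable<UniqueKey>
{
    // Each customer-specific key supplies its own canonical text form.
    protected abstract string CanonicalValue { get; }

    public bool Equals(UniqueKey other)
    {
        // Keys are equal only when they are the same kind of key with the same value.
        return other != null
            && other.GetType() == GetType()
            && other.CanonicalValue == CanonicalValue;
    }

    public override bool Equals(object obj) { return Equals(obj as UniqueKey); }
    public override int GetHashCode() { return CanonicalValue.GetHashCode(); }
    public override string ToString() { return CanonicalValue; }
}

// Customer A happens to use Guids.
public sealed class GuidKey : UniqueKey
{
    private readonly Guid id;
    public GuidKey(Guid id) { this.id = id; }
    protected override string CanonicalValue { get { return id.ToString("D"); } }
}

// Customer B happens to use integers.
public sealed class IntKey : UniqueKey
{
    private readonly int id;
    public IntKey(int id) { this.id = id; }
    protected override string CanonicalValue
    {
        get { return id.ToString(CultureInfo.InvariantCulture); }
    }
}
```

A third customer using text keys would add one more small subclass and nothing in the base project would need to change.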

Monday, 23 May 2011

Error TSD03006: has an unresolved reference to object.

This pain in the neck had occurred after creating a new VSTS project and adding what I thought was perfect SQL. The error pointed to Database.sqlpermissions and said the role I had granted a permission for did not exist even though it did!

Well, there were two things wrong in my project and one assumption which made it hard to track down. The fundamental problem was that I had spelt authorisation with an S instead of a Z in my role SQL (the keyword is AUTHORIZATION), which made it invalid, but my assumption was that errors are printed in dependency order (i.e. fix the first one to fix the rest). This assumption was wrong and the error for the role was the last one listed!

The other problem was that even if your database properties say that your database is case-insensitive, the build environment will only recognise references that match case. For example, if you reference a column called id with [Id], the reference will fail.

Wednesday, 18 May 2011

Do you know what DRY means and practise it? How do you code for the unknown? These two practices give the highest possible quality yield in software production but are often misused or not done at all. If things are left to individuals, they will be variable at best and absent at worst. If management, however, can insist on processes that include these, they will see high quality software produced.

A friend of mine once told me about a software test that consists of writing a function that takes 3 integers and determines whether the 3 values could form a valid triangle. The test is how many of the checks you could think of. How many do you think there are? One of them, for instance, is that the values cannot be negative, but there are around 16 tests, all of which together provide complete coverage of the permutations possible. The point here is that coding for things we know about and can think of is easy; coding for those we can't foresee is impossible by definition.
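For what it's worth, here is one way the triangle function itself might look. This is only a sketch: I am assuming that zero-length sides and "flat" triangles (where two sides add up to exactly the third) count as invalid, which the original test may or may not have intended.

```csharp
public static class Triangle
{
    // Returns true when the three side lengths can form a non-degenerate triangle.
    public static bool IsValid(int a, int b, int c)
    {
        // Negative and zero lengths can never be sides of a triangle.
        if (a <= 0 || b <= 0 || c <= 0)
        {
            return false;
        }

        // Widen to long so that adding two large ints cannot overflow.
        long x = a, y = b, z = c;

        // Triangle inequality: every pair of sides must be longer than the third.
        return x + y > z && x + z > y && y + z > x;
    }
}
```

The widening to long is exactly the sort of check that is easy to miss: without it, two very large sides would overflow when added and the function would reject a perfectly good triangle.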

So how do we code to cope with the unknown? Well, unit tests can be helpful but not in all scenarios, because you have to set up the unit tests to assume a certain set of inputs for an expected output. Proving just one or two cases from millions of possibilities is weak, to say the least.

Experience and documentation can be useful. For instance, you know the range of values that an INT can take although in most cases negative, zero and positive are enough to test a method. What about nullable fields? Do you test for null? What about the max values? What happens if you pass Int32.MaxValue into a method which adds 1 to it, do you know what does/should happen? By writing down standard choice values for certain types, you can then build up permutations to use in tests. On this topic, it is also good to not allow fields to be null in an object if they are never allowed to be null in normal use. No need to test something that is illegal (unless it is to prove that the validation works).
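To answer the Int32.MaxValue question concretely, here is a small sketch. Counter and BoundaryValues are names made up for this example. In C# the result depends on whether the arithmetic runs in a checked or unchecked context, so the sketch spells out both rather than relying on the compiler default.

```csharp
using System;

public static class Counter
{
    // A written-down set of standard choice values for int parameters.
    public static readonly int[] BoundaryValues =
    {
        int.MinValue, -1, 0, 1, int.MaxValue
    };

    // Unchecked arithmetic wraps around silently on overflow.
    public static int IncrementUnchecked(int value)
    {
        unchecked { return value + 1; }
    }

    // Checked arithmetic throws OverflowException instead of wrapping.
    public static int IncrementChecked(int value)
    {
        checked { return value + 1; }
    }
}
```

Passing Int32.MaxValue to IncrementUnchecked gives Int32.MinValue, while IncrementChecked throws. Which of the two you want is a design decision, but it should be a decision and not an accident.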

The other way that works really well for coding for the unknown is keeping methods in small and manageable chunks. You can then put very simple constraints on them and 'know' that your code is bombproof, since it will fail in an expected way if called with illegal data rather than simply crash. Imagine having something like:

private void SomeMethod(int i, int j)
{
    // ArgumentOutOfRangeException takes the parameter name first, then the message
    if (i < 0)
        throw new ArgumentOutOfRangeException("i", "i cannot be less than zero");
    if (j < 0)
        throw new ArgumentOutOfRangeException("j", "j cannot be less than zero");

    // Now we can do functionality with values we know are valid
}

Another way in which you can help is by using intelligent types. If an integer must always be greater than 0, you could either use an unsigned type or even create your own struct called IntGreaterThanZero or similar, which again throws an exception if you ever try to assign an illegal value to it. In this case, you catch the error much earlier on in development and your method could become:

private void SomeMethod(IntGreaterThanZero i, IntGreaterThanZero j)
{
    // Now we know that our method will succeed because the constraint will already be correct
}
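The IntGreaterThanZero type could be sketched roughly as below. This is one possible shape and not a finished library type; in particular, the guard in the Value getter is my own addition, because a struct can always be created with default(...) without running the constructor.

```csharp
using System;

public struct IntGreaterThanZero
{
    private readonly int stored;

    public IntGreaterThanZero(int value)
    {
        // Reject illegal values at the moment of assignment.
        if (value <= 0)
        {
            throw new ArgumentOutOfRangeException("value", "value must be greater than zero");
        }
        stored = value;
    }

    public int Value
    {
        get
        {
            // default(IntGreaterThanZero) bypasses the constructor and leaves 0 behind.
            if (stored <= 0)
            {
                throw new InvalidOperationException("IntGreaterThanZero was never initialised");
            }
            return stored;
        }
    }

    // Converting from int can fail, so it has to be asked for explicitly.
    public static explicit operator IntGreaterThanZero(int value)
    {
        return new IntGreaterThanZero(value);
    }

    // Converting back to int is safe for any initialised value.
    public static implicit operator int(IntGreaterThanZero wrapped)
    {
        return wrapped.Value;
    }
}
```

A caller would then write SomeMethod((IntGreaterThanZero)count, (IntGreaterThanZero)size) and any illegal value blows up at the call site, not somewhere deep inside the method.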

This brings me to the other point and that is DRY (don't repeat yourself), which says that if you have to do the same thing more than once, you should probably re-factor. I don't mean that you cannot test for something being equal to true in more than one place, but if you are making carbon copies of the same code (or similar) then you are asking for trouble. Consider the following code:

How many things do you think are wrong with this code? Really the difference between poor code and good code comes down purely to your ability to code for known issues and to assume others that are not obvious.

Firstly, the point here is that we have two methods that do different things but end up calling the same method and then both have to check for null and call something else. The problem here? If someone else calls the ServiceCallMethod, how can we insist that they check for null in the return value? If they don't, the problem could manifest much later in the code and might take several minutes, hours or even days to track back to a bad service call. The point here is that we can push the null check down into the ServiceCallMethod and it becomes impossible to call the method without checking for null (this assumes someone doesn't just call the service directly somewhere else, but that still comes back to DRY). This is not the whole story though. What else is missing?

Consider the bigger picture. Just from what you can see/understand from the code, there is one big unknown and that is what will happen when you call the service. It is not just a case of succeed or fail: it could throw an exception, it could return null or some other populated or semi-populated object, or it could return a whole host of errors which might be related to the network, security or the service itself. Currently none of these are checked or catered for. You might say that these situations should simply allow the program to die but that is naive and actually a wrong assumption. Leaving exceptions to fly up the stack can either cause some other behaviour to fail, leading you to fix the wrong area, or be masked by a catch statement further up which might or might not re-throw. Where exceptions are possible, they should be caught and logged properly. It is up to you whether you then rethrow, retry, show a helpful message to the user such as "Sorry, the remote service cannot be contacted", or do something else. You cannot always tell what exceptions are thrown by a method call, so catching them as soon as possible allows the people calling the method to know what to expect and to deal with it.
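Putting those two points together (the null check pushed down, and exceptions caught and logged at the point of the call), a ServiceCallMethod might be sketched like this. ServiceResult, ServiceUnavailableException, ServiceClient and the two delegates are all hypothetical names standing in for the real service proxy and logger, which I am not showing here.

```csharp
using System;

// Hypothetical result type returned by the remote service.
public class ServiceResult
{
    public string Payload { get; set; }
}

// The one exception type that callers of ServiceCallMethod need to expect.
public class ServiceUnavailableException : Exception
{
    public ServiceUnavailableException(string message, Exception inner)
        : base(message, inner) { }
}

public class ServiceClient
{
    private readonly Func<string, ServiceResult> remoteCall; // stands in for the service proxy
    private readonly Action<string> log;                     // stands in for the logger

    public ServiceClient(Func<string, ServiceResult> remoteCall, Action<string> log)
    {
        this.remoteCall = remoteCall;
        this.log = log;
    }

    // The single place that talks to the service. Callers never see null
    // and never see a raw network or security exception.
    public ServiceResult ServiceCallMethod(string request)
    {
        ServiceResult result;
        try
        {
            result = remoteCall(request);
        }
        catch (Exception ex)
        {
            // Deliberately broad for this sketch: log it, then rethrow as a known type.
            log("Service call failed: " + ex.Message);
            throw new ServiceUnavailableException("Sorry, the remote service cannot be contacted", ex);
        }

        if (result == null)
        {
            log("Service returned null for request: " + request);
            throw new ServiceUnavailableException("The remote service returned no data", null);
        }

        return result;
    }
}
```

Whether you rethrow, retry or show a message is still your choice; the sketch rethrows a single known exception type so that every caller has exactly one failure case to deal with.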

Tuesday, 3 May 2011

Well, I say debacle, but to be fair that is probably only because you expect someone as large as Sony to have all the best security experts. It again underlines something I have mentioned many times before: the security issues are well known and documented and have been for a long time; the problem is that most companies do not have the processes or quality control to actually audit these things. People make software updates, buy in third-party software, contract out parts of the system to others whom they cannot vouch for, and all sorts of other things. These are day-to-day realities of IT companies, and yet so many organisations are simply too lax with security. At least the data that was stolen (or potentially stolen?) from Sony was encrypted so it is perhaps unlikely to be cracked, since cracking encryption is extremely difficult if you don't know what encryption method has been used. The problem with credit card numbers is that you know the answer is either numeric or numeric with dashes, so you have a crib to the solution. On the other hand, if you ensure your columns are not called CreditCard and ExpiryDate but something esoteric like CX25 and EX54, and possibly even padded to a fixed length, then people are unlikely to be able to deduce what the data is that they have stolen!

Even basics like locking out accounts that have been accessed too many times or with multiple incorrect passwords make it so much harder to brute-force attack systems.
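As a rough illustration of how little code a basic lockout needs, here is a sketch. LoginThrottle and its method names are invented for this example; it keeps its counters in memory and is not thread-safe, so a real system would need to store them somewhere shared and persistent.

```csharp
using System;
using System.Collections.Generic;

public class LoginThrottle
{
    private readonly int maxFailures;
    private readonly TimeSpan lockoutPeriod;
    private readonly Dictionary<string, int> failures = new Dictionary<string, int>();
    private readonly Dictionary<string, DateTime> lockedUntil = new Dictionary<string, DateTime>();

    public LoginThrottle(int maxFailures, TimeSpan lockoutPeriod)
    {
        this.maxFailures = maxFailures;
        this.lockoutPeriod = lockoutPeriod;
    }

    // Check this before even looking at the password.
    public bool IsLockedOut(string account, DateTime now)
    {
        DateTime until;
        return lockedUntil.TryGetValue(account, out until) && now < until;
    }

    // Call this after every incorrect password.
    public void RecordFailure(string account, DateTime now)
    {
        int count;
        failures.TryGetValue(account, out count); // count stays 0 if the account is unknown
        count++;

        if (count >= maxFailures)
        {
            // Too many failures: lock the account and start counting afresh.
            lockedUntil[account] = now + lockoutPeriod;
            count = 0;
        }
        failures[account] = count;
    }

    // A successful login clears the failure count.
    public void RecordSuccess(string account)
    {
        failures.Remove(account);
    }
}
```

The current time is passed in as a parameter so that the behaviour can be tested without waiting for a real lockout to expire.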

I've been using Windows Workflow this year and to be honest, I really like it. Well, I like it in theory but there are a few things that are buggy (I am using version 3.5) and which could easily cause me to not use it if I had a choice:

The projects take an age to load and compile. VS2005 and about 100 projects but still, it is virtually unusable and I would imagine that many organisations have projects larger than this. I have a quad core with 8Gb so I would expect it to fly.

If you make the name of an activity the same as the class name, it will not tell you that it is invalid. What actually happens is that it will create a new member variable with the new name but keep the old name as it is so it looks unchanged. The newly created member will be unreferenced. Not sure why the names can't match.

I have quite a few problems related to VS2005 crashing and shutting down and this coupled with the time it takes to load the project is impossible!

The Toolbox takes about 2 minutes to populate (open the first time) with all the activities in my solution. I tried deleting a load of them from the toolbox so I can just have the ones I need but for some reason, they are automatically all added in again when you build.

For state-machine type systems however, Workflow is definitely worth the hassle because it reduces the amount of code you otherwise have to write to produce a pseudo state-machine in normal classes. The designer is pretty usable too. Just make sure you understand dependency properties and the correct levelling of activities otherwise you will end up with a hotch-potch of non-reusable components that all require code activities everywhere to tie them together.

Tuesday, 26 April 2011

Those of us who have programmed for a while are fairly familiar with and comfortable using the if statement in code, but the if statement, if misused, can be both confusing and a source of latent defects, that is, defects which are not realised for a while and then jump out and bite you. What is the distinction between a good and a bad if?

A good if statement should always be valid whatever happens to the system, for example in a method which creates a type of class to manage a port.
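A minimal sketch of such a method is shown here. PortType, IPort, TcpPort, UdpPort and PortFactory are illustrative names only and do not come from any particular project.

```csharp
using System;

public enum PortType { Tcp, Udp, Serial }

public interface IPort
{
    string Describe();
}

public class TcpPort : IPort
{
    public string Describe() { return "TCP"; }
}

public class UdpPort : IPort
{
    public string Describe() { return "UDP"; }
}

public static class PortFactory
{
    public static IPort Create(PortType type)
    {
        if (type == PortType.Tcp)
        {
            // Always true: if the type is TCP, we create a TCP port.
            return new TcpPort();
        }
        else if (type == PortType.Udp)
        {
            return new UdpPort();
        }
        else
        {
            // Fail loudly for any port type we have not been taught to build.
            throw new ArgumentOutOfRangeException("type", "Unsupported port type: " + type);
        }
    }
}
```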

If you think about it, although we might decide to use a different port in the future, the logic always holds true that if the type is TCP, we will create a TCP port. If we need to expand the functionality of the port by extending its interface, we can do so without 1) touching the if statement and 2) invalidating the logic of the if statement. Ideally, we would then have an else statement which might throw an exception to ensure we are using a valid port type.

Now consider a classic if statement that decides whether a message is a string by testing whether its first character is an S.
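The kind of check I mean can be sketched as follows; LegacyMessageCheck and Describe are made-up names and the message format is assumed purely for illustration.

```csharp
public static class LegacyMessageCheck
{
    // Fragile: the first character is treated as the whole truth about the message type.
    public static string Describe(string rawMessage)
    {
        if (rawMessage[0] == 'S')
        {
            return "string message";
        }
        else
        {
            return "non-string message";
        }
    }
}
```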

Can you see what the problem is here? There is an assumption buried deep beneath the code that says a message which is a string ALWAYS has an S as the first character and a message which does NOT have an S is NEVER a string. This might seem perfectly reasonable and correct, but what happens when somebody decides in 3 years' time that X represents an extended string? These new messages will fail the logic test but possibly in a way that is not obvious. This is a very easy defect to create and it might well pop its head up at the customer site!

I am not a Puritan but I believe in making code as reliable as possible. We would not accept an aircraft designer thinking something is a quick hack, and yet we allow it in software because perhaps we don't give it the respect and effort that should be given to it. Of course, it would be unfair to bring this issue up without telling you about alternatives. Ultimately there are three things that can really help you: 1) the compiler is your friend - use constructs that the compiler can check; 2) OO design allows you to embed logic into classes; and 3) do not repeat yourself. If you need to ask the same question in multiple places, you are asking for one or more of them to fall out of sync.

Here is an example of using the compiler and OO to help you code logic strongly:

Message message = LoadMessage();
message.PreProcess();
// Carry on

Here we introduce a class to represent the message (it might only be one class to begin with) and we create a method that is fairly generic and allows us to preprocess the message. In our case, we probably want a class to represent a normal message and one that represents string messages (which could inherit from the normal one). In this case, the string message PreProcess allows us to do whatever it is we needed to do inside this message with NO logical branching. Adding new messages with inheritance allows us to force people to consider the various methods that are available by making them abstract/pure virtual.

This leads to the 3rd principle, which says that in one place, and one place only, I will use some sort of data loader to decide which class to load for any new messages I create. This follows the safe rule that the logic is always true: by definition, if I add a new message, I have to extend or modify the logic in a single place to load my new handler class, and this removes the chance of a bug.
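A sketch of such a loader might look like this. The type markers 'N' and 'S', the class names and the idea that PreProcess returns a string (so the result is visible) are all assumptions for the sake of the example, and for brevity StringMessage derives straight from the abstract base.

```csharp
using System;

public abstract class Message
{
    // Abstract, so every new message type is forced to decide what pre-processing means for it.
    public abstract string PreProcess();
}

public class NormalMessage : Message
{
    public override string PreProcess() { return "normal"; }
}

public class StringMessage : Message
{
    public override string PreProcess() { return "string"; }
}

public static class MessageLoader
{
    // The one and only place that maps a type marker to a handler class.
    public static Message Load(char typeMarker)
    {
        switch (typeMarker)
        {
            case 'N':
                return new NormalMessage();
            case 'S':
                return new StringMessage();
            default:
                // A forgotten message type falls over here, during testing, not at the customer site.
                throw new NotSupportedException(
                    "No message class registered for marker '" + typeMarker + "'");
        }
    }
}
```

Adding the extended string message from the earlier example would then mean writing one new class and adding one case here, and nothing else in the code needs to ask what kind of message it has.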

The important thing is that the logic is in a single place and not buried somewhere down in a method. Using the exception mechanism for defensive programming is then an easy and quick way to fall over during testing if something has been forgotten.

Click here for a great article about how to get your 7711ln Wireless LAN network (PCI) card working in Linux. I am using Kubuntu 10.10 (kernel 2.6.35-28) and installed the card as normal not really expecting it to not work! Fortunately, EdiMax provide the driver on their website which is in fact a Realtek driver for the 3562 chipset (the driver actually builds for various flavours including this).

Ignore what it says in the readme file which is very complicated and confusing when confronted with all the options and simply follow the instructions on the link. You can ignore any errors about the lack of the /tftpboot directory.

In the line which says

sudo ifconfig ra0 inet up

replace ra0 with the interface name for your wireless card, which might be different. Type ifconfig by itself on the command line and you will see a list, probably including eth0, lo and either ra0 or wlan0/1. ra0 or wlan0/1 is your wireless card.

If you have a newer Linux, once you follow the instructions and reboot, you might find that the card is still not seen by the Network Manager in the system tray (you won't see any wireless networks). That is probably because by default Linux will load the rt2800pci (old built-in) driver and this will take priority over the one you just built. To find out, type lspci -v at a command line and see what is written next to the entry for your wireless network card. It will start with something like:

If you have the problem, the driver in use will show rt2800pci. In my case it has loaded the driver rt2860 from the driver I built myself, contained inside rt3562sta.

In order to force it to use your newly built driver, you need to blacklist the built-in driver. Simply open /etc/modprobe.d/blacklist.conf in a text editor (as sudo) and then add the line blacklist rt2800pci to the end (with a comment to help remember why you did it!).

Monday, 11 April 2011

Simple situation. I reference a dll in the web.config of a site. I clicked "Add reference" and browsed to the directory where the dll was and added it with no problems. When it came to checking into source control and building, I got "Could not find type ..." which was muchos confusing.

It turns out that because the library I was referencing was the same version as one that I had installed in the GAC, it told the project to look in the GAC for the library rather than the location I had referenced. All very clever and very confusing. In my case I could remove the library from the GAC and reference it again. This time I got a refresh file (which tells the project where to start looking for the library) which I checked in for safe keeping and all was good in the world. Nice!

Monday, 21 March 2011

I discovered something that might explain why tests sometimes seem to fail for no obvious reason after an individual test has failed.

Often in NUnit, you set up various mocks in [TestFixtureSetUp] and then run one or more tests. In [TestFixtureTearDown], these mocks are disposed of. What is important to know is that if an exception is thrown during TestFixtureSetUp, the TestFixtureTearDown is NOT called.

What this means is that, supposing your mock #1 redirects an endpoint to a locally hosted mock, if you set up mock #1 but an exception is thrown while setting up mock #2, then mock #1 will not be disposed and one of two things might happen. Firstly, when you try and set up the mock again, it might fail on the basis that something is already "listening" at the endpoint it is trying to use. Secondly, if you don't dispose the mock, the endpoint is not reset back to what it should be. In this case, if another test does not use a mock but DOES use the endpoint for something, then it will have been left pointing to your local mock site but with nothing listening on it. Your test might then fail because it gets no response to its request.

To avoid all of this, add a catch to your TestFixtureSetUp which, in the case of an exception, will call Dispose on all mocks that are not null.
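The pattern looks roughly like this. To keep the sketch free of any NUnit reference I have left the attributes as comments, and EndpointMock is a hypothetical stand-in for whatever mock you host locally; the failOnStart flag only exists to simulate the second mock blowing up.

```csharp
using System;

// Hypothetical mock that occupies an endpoint until it is disposed.
public class EndpointMock : IDisposable
{
    public bool Disposed { get; private set; }

    public EndpointMock(bool failOnStart)
    {
        if (failOnStart)
        {
            throw new InvalidOperationException("Endpoint already in use");
        }
    }

    public void Dispose() { Disposed = true; }
}

public class ServiceTests
{
    public EndpointMock First { get; private set; }
    public EndpointMock Second { get; private set; }

    // In a real fixture this method would carry NUnit's [TestFixtureSetUp] attribute.
    public void FixtureSetUp(bool secondFails)
    {
        try
        {
            First = new EndpointMock(false);
            Second = new EndpointMock(secondFails);
        }
        catch
        {
            // Tear-down will not run for us, so clean up here and then re-throw.
            DisposeMocks();
            throw;
        }
    }

    // In a real fixture this method would carry [TestFixtureTearDown].
    public void FixtureTearDown()
    {
        DisposeMocks();
    }

    private void DisposeMocks()
    {
        if (First != null) First.Dispose();
        if (Second != null) Second.Dispose();
    }
}
```

Re-throwing after the clean-up matters: the fixture should still be reported as failed, it just should not leave an endpoint hijacked for the tests that run after it.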

Monday, 14 March 2011

The CJuiAutoComplete in Yii is great but because it uses the standard interface, there is one thing that cannot be done directly with the Yii control, that is customising the list that is displayed when you type something in. My last example was a simple list that just displayed the label and selected this when you clicked it (and stored an id number in a hidden field) but in my case, I am searching for church names and there would be lots of duplicates so the name of the church by itself is not enough.
There is an 'undocumented' feature of the jQuery autocomplete (which the Yii control uses) which is mentioned on various forums and which can be used to override the default behaviour. It is called _renderItem and by default it only prints the labels. This needs to be 'manually' linked to the jQuery object since it cannot be joined directly to the CJuiAutoComplete, and this uses the Yii function registerScript(). Anyway, this is the code I used (see my earlier example for what all the other controls are):

Most of these names and code I left as per the example because I wasn't totally sure what they were. The only things I changed were the name of the script which is used as a dictionary lookup, the jQuery selector (#churchac) and the line which appends the <a> reference. This by default prints only item.label. In my case, all I did was add some commas and the street and city names. You should retain the <a> tags if you want the arrow keys and such like to work.

Update

I tried to use this recently and it didn't work (cannot set _renderItem on undefined). I don't know what has changed, it is possibly a timing issue in my code, but a workaround is to modify the _renderItem method for ALL instances of autocomplete by re-defining the main definition instead of only the one for this specific instance. You do this by changing line 2 from:

Thursday, 10 March 2011

I am generally pleased with Windows 7. It seems to perform well and is generally usable but there are some things that are annoying about it. At the end of the day, an operating system is just a platform to run productivity applications; it should not be any real beast in its own right and should not get in the way.

My biggest complaints are changes. For instance, trying to get into the various options screens of Windows Explorer is horrifically complicated when you are used to the older way of Tools-Options, a fairly universal concept that millions of people were used to. Now they are spread across various unintuitive icons and menus which look like they were 'designed' by a child.

Another thing which I come across a lot is the renaming of standard shortcuts so you can't find them. Most of us were used to "Add or Remove Programs", which has now been renamed to "Programs and Features". This doesn't really say much differently, except it is now in a different place in the control panel because it starts with P instead of A. Why are people allowed to change these? If you want to mention features, call it "Add or Remove Programs/Feature". When you test these things, you would think somebody would notice and complain.

Thursday, 3 March 2011

I got loads of errors like this in a VS2008 project the other day. It seemed really confusing since these errors related to files I had not even touched.

It turned out that a dependent project had not been built - which had not stopped the solution building - which meant the project references of other projects had nothing to reference and therefore could not find the metadata.

Looking up the output window to the first error showed what actually caused the problem and fixing this got rid of the cascaded errors.

Monday, 28 February 2011

Firstly, we need to be clear. The memory I am talking about is called RAM, not the 'memory' which exists on your hard disk to store programs and files in. They are measured in the same way but RAM is more expensive and usually smaller, for instance 4 gigabytes (GB - approx 4 billion bytes) whereas a hard disk might be 240GB or larger.

The point here is that a hard disk, although cheaper, is much slower because it contains moving parts, literally solid disks of data being read by a moving head. RAM is electronic and very fast.

When you run programs on your PC, they will require a certain amount of program memory and this will be allocated from your PC RAM. However, if you do not have enough RAM (if you are already running too many programs and/or you don't have much RAM in the first place) then the computer will still allocate the memory but it will put it on the hard disk instead - a bit like putting your possessions into storage when you run out of room in your house.

The data can be moved into and out of this 'virtual memory' but, like the storage unit, it takes much longer to send it and get it back. This makes the computer slower because programs wait for the memory contents to come back before they can carry on. You can tell this is happening because the hard disk will spend lots of time spinning.

As time goes on, programs require more and more memory, partly because they do more things and partly because people tend not to worry about memory requirements when writing software.

The easy answer when this happens is simply to buy more memory for your PC. You can either search a site like crucial.com for what memory your PC takes or take the model number and make to a shop where they can check for you. Most memory is fairly standard and you should be able to get at least 4GB depending on whether your computer can access this much (after a while, there are too many spare rooms in the house and you have to go back to using storage). For most people using non-specialised software, this is more than enough memory to get good performance.

1GB will cost around £15/$25 in the UK so it is not much money to improve your PC.

If your PC is ancient, you are probably better off buying a newer basic model for a couple of hundred pounds, since then all the components will be improved.


About Me

I work for PixelPin, where I am in charge of all development for our company, which includes mostly .Net web applications but also PHP, Android and iOS programming as well as managing our hardware and cloud-based systems.

I live in Cheltenham, Gloucestershire in the UK which is lovely in the summer and miserable in the winter.