I develop high-volume processing systems: mathematical models that calculate various parameters from millions of records, derived fields calculated over millions of records, processing of huge transaction files, and so on.

I am well aware of unit testing methodologies, and when my code is in C# I have no problem unit testing it. The problem is that I often have code in T-SQL, C# code that runs as a SQL stored assembly, SSIS workflows with a good amount of logic (and outcomes, etc.), or some SAS process.

What approach do you use when developing such systems? I usually develop several tests as stored procedures in a designated schema (TEST), run them automatically overnight, and check the results. But this only covers T-SQL, and continuous integration is hard. The bigger problem is testing SSIS packages. How do you test them? What is your preferred approach for stubbing data into tables (especially if you need a lot of data initialization)? I have an approach derived over the years, but maybe I am just not reading enough articles.

So, banking, telecom, and risk developers out there: how do you test your mission-critical apps that process millions of records at end of day, month end, etc.? What frameworks do you use? How do you validate that your SSIS package is correct (as you develop it)? How do you achieve continuous integration in such an environment (personally, I never got there)? I hope this is not too open-ended a question. How do you test your map-reduce jobs, for example (I do not use Hadoop, but this is quite similar)? luke

What makes you think that simply because you have automated tests that somehow that means it is Agile? We've been doing automated tests long before the words agile, extreme programming, TDD, SCRUM entered the software vernacular.
– Dunk, Mar 7 '11 at 20:28

No, tests are not the only thing, but they are one of the fundamentals; the others are Agile design, continuous integration, and many more (like short dev cycles). I am mostly interested in tests (because they show the design), but I also ask questions about other paradigms, because I think software verification is most difficult and most important in this kind of software. It's not just about test automation; it's about designing software in a testable way and easily providing test data that validates it.
– luckyluke, Mar 7 '11 at 21:16

2 Answers

For testing stored procedures, loading tables with data and rolling things back, I've found very little to compare with TSQLUnit. I've used this on several projects (including some for financial services customers) and I've found that it's well worth the effort.
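The load, run, assert, roll back cycle described above can be illustrated with a minimal sketch. This is not TSQLUnit and not T-SQL: it uses Python's built-in `sqlite3` module as a stand-in database, and the `accounts` table and the `derive_overdrawn_flag` function are hypothetical names invented for the example. The point is only that fixture data lives inside a transaction that is always rolled back, so the test leaves the database unchanged.

```python
import sqlite3


def derive_overdrawn_flag(conn):
    """Hypothetical logic under test: flag accounts with a negative balance."""
    conn.execute(
        "UPDATE accounts SET overdrawn = CASE WHEN balance < 0 THEN 1 ELSE 0 END"
    )


def run_test_with_rollback(conn):
    """Load fixture rows, run the logic, read the outcome, then roll back."""
    conn.execute("BEGIN")
    try:
        # Fixture data exists only inside this transaction.
        conn.executemany(
            "INSERT INTO accounts (id, balance, overdrawn) VALUES (?, ?, 0)",
            [(1, 100.0), (2, -5.0), (3, 0.0)],
        )
        derive_overdrawn_flag(conn)
        return conn.execute(
            "SELECT id, overdrawn FROM accounts ORDER BY id"
        ).fetchall()
    finally:
        # Always undo the fixture rows, whether the test passed or failed.
        conn.execute("ROLLBACK")


# isolation_level=None puts the connection in autocommit mode, so the
# explicit BEGIN/ROLLBACK statements above fully control the transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL, overdrawn INTEGER)"
)

result = run_test_with_rollback(conn)
remaining = conn.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]

assert result == [(1, 0), (2, 1), (3, 0)]
assert remaining == 0  # the fixture rows were rolled back
```

In a real T-SQL setup the same shape would presumably apply, with the framework handling the transaction around each test procedure; check the framework's own documentation for how it does this.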

For testing SSIS packages, have you tried ssisUnit? I have just started to play with this and haven't used it on a real-life project so can't comment on what its pitfalls are, but it seems to fill the gap that exists in testing SSIS packages.

You can always try to generalize, but you can't generalize everything. Agile practices are good, but they still can't be applied to everything. Also, there are no silver bullets ;)

I think you have hit a sort of wall here, and obviously you are asking a theoretical question, because in practice you are trying to use agile techniques on mission-critical systems... and that's odd, because the development objectives (or qualities) of such systems center on ensuring that the data is as close to error-free as possible.

The ideal environment for agile methodologies is a small, highly variable project, not an error-free system or a highly efficient design (as you state in your problem). Even if you consider "agile techniques", the usual advice is to keep your design simple, but that would be of no use to you.

So, how would I approach this requirement? Through engineering methods:

Create a development/testing environment, which you can use whenever you like

Reduce data size, but keep data variability (generalize record and data types) in the development environment

Do not program; just create simulations and concentrate on the algorithm and its complexity

Only when you feel you are really getting somewhere with your improved design, try it on production, but don't waste time trying to force daily releases and such; this just isn't an agile problem.
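
One way to read the "reduce data size, but keep data variability" step above is stratified sampling: keep a handful of rows for every distinct combination of the fields that drive the logic, so rare cases survive the reduction. The following is only a sketch under that assumption; the record layout `(transaction_type, currency, amount)` and the function name `stratified_sample` are hypothetical and not taken from any framework.

```python
import random
from collections import defaultdict


def stratified_sample(records, key, per_group, seed=0):
    """Keep up to `per_group` records for every distinct value of `key`."""
    rng = random.Random(seed)  # fixed seed so the reduced data set is reproducible
    groups = defaultdict(list)
    for rec in records:
        groups[key(rec)].append(rec)

    sample = []
    for group_key in sorted(groups):  # sorted for a deterministic order
        members = groups[group_key]
        k = min(per_group, len(members))  # small groups are kept whole
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical transaction records: (transaction_type, currency, amount)
records = (
    [("PAYMENT", "USD", float(i)) for i in range(1000)]
    + [("REFUND", "USD", float(i)) for i in range(50)]
    + [("CHARGEBACK", "EUR", 12.5)]  # a rare case that must not be lost
)

small = stratified_sample(records, key=lambda r: (r[0], r[1]), per_group=5)

assert len(small) == 11  # 5 payments + 5 refunds + 1 chargeback
assert {r[0] for r in small} == {"PAYMENT", "REFUND", "CHARGEBACK"}
```

A plain random sample of the same size would very likely drop the single chargeback row, which is exactly the kind of variability the development data set needs to keep.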