2011 August


In this digital day and age, we all know that information in any discipline is most effective and useful when it is available in an electronic medium accessible to the audience at large. While the print medium still exists, it has been largely overtaken by digital information. Many visionaries have commented on the importance of digital information over the next decade, and we have seen a number of revolutionary products, devices and applications enter this space over the last few years. We all understand the primary advantages of digital books: round-the-clock access to information, minimal space to store and archive data, easy retrieval of information through search and other associated features, and more secure data storage than in a physical location, to name a few. There are also challenges to going digital, the primary one being how copyright issues are handled, especially when digital copies can be easily duplicated and shared. Digital content providers are attempting to alleviate this by introducing Digital Rights Management to control and promote authorized access to information. While content providers and publishing firms work to mitigate such challenges globally, no one is really denying or questioning the benefits of *going digital*.

Drawing on our experience as a QA specialist that has been testing digital content for over seven years, we at QA InfoTech talk here about some of our core QA approaches for ensuring such digital content is ready for global consumption, and about what happens behind the scenes during the digitization process.

Firstly, content can come in various forms, ranging from a print copy to a text file, a Word document or a PDF, to name a few. It may be plain text or may carry embedded images. In such cases the content to be digitized undergoes a *conversion*, while in other cases the digital content is created from scratch. If content is in print form, the first step in digitization is to scan it; commercial scanners are typically used to scan the volumes of print content collected and preserved over the years. One of the prime technologies used in content digitization is XML. XML forms the transitional state the content is moved into before it is output in an electronic format. With XML in use, it is very important for the content to adhere to defined standards, based on which the content gets tagged with the required headers, sections and sub-sections. This schema creation is the core of the content digitization process. There are many tools and automated solutions these days that take in the input file and create the XML file; OmniMark is one example. Once such XML files have been created, processing engines come into play to consume the XML and create the runtime output, which is what users get to see. This could be in various formats such as EPUB, OEB, Mobipocket, FictionBook, Microsoft Reader (LIT) and so on.
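To make the conversion step concrete, here is a minimal sketch of the tagging stage, turning marked-up plain text into a transitional XML document. The element names (book, chapter, para) and the "Chapter:" marker convention are hypothetical stand-ins; a real project would tag per the schema called out in its specification.

```python
# Minimal, hypothetical sketch of the tagging step in content
# digitization: plain text with simple "Chapter:" markers is converted
# into a transitional XML document using only the standard library.
import xml.etree.ElementTree as ET

def to_transitional_xml(raw_text: str) -> str:
    book = ET.Element("book")
    chapter = None
    for line in raw_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("Chapter:"):
            # Start a new chapter; the title attribute comes from the marker.
            chapter = ET.SubElement(book, "chapter",
                                    title=line.split(":", 1)[1].strip())
        elif chapter is not None:
            # Everything else becomes paragraph content under the chapter.
            ET.SubElement(chapter, "para").text = line
    return ET.tostring(book, encoding="unicode")

sample = """Chapter: Introduction
Digital content is most useful when widely accessible.
Chapter: QA Approach
Verify the transitional XML before ingestion."""
print(to_transitional_xml(sample))
```

In practice the conversion is driven by tools such as OmniMark rather than hand-rolled scripts, but the shape of the output — structured, tagged XML ready for a processing engine — is the same.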

From a QA standpoint, content digitization verification happens at two stages: first, when the transitional XML is created, and second, when the XML is ingested into the processing engine to create the runtime files. At the first stage, it is very important to verify that the XML has been created per the specification, as most defects can be easily caught here. XML tags, structure and adherence to the schema called out in the specification are the prime items being verified, so the QA team needs to get involved early in the conversion cycle rather than after the digitization has taken place. There may also be scenarios where the source content is a PDF file that needs to be extracted at an intermediate step, using technologies such as OpenOffice, before ingestion takes place. If so, the extraction is also a very important stage for QA.
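A first-stage check can be sketched as follows: verify well-formedness, then apply structural rules. The rules used here (a root book element, every chapter carrying a title attribute and at least one paragraph) are hypothetical examples; a real effort would validate against the schema named in the specification, typically with a schema-aware tool rather than hand-written rules.

```python
# Minimal sketch of first-stage QA on a transitional XML file:
# well-formedness plus a few hypothetical structural rules,
# using only the standard library.
import xml.etree.ElementTree as ET

def validate_transitional_xml(xml_text: str) -> list:
    errors = []
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # Not well-formed: nothing further can be checked.
        return [f"not well-formed: {exc}"]
    if root.tag != "book":
        errors.append(f"unexpected root element <{root.tag}>")
    for i, chapter in enumerate(root.findall("chapter"), start=1):
        if not chapter.get("title"):
            errors.append(f"chapter {i} missing title attribute")
        if chapter.find("para") is None:
            errors.append(f"chapter {i} has no <para> content")
    return errors

good = '<book><chapter title="Intro"><para>Text.</para></chapter></book>'
bad = '<book><chapter><para>Text.</para></chapter></book>'
print(validate_transitional_xml(good))  # []
print(validate_transitional_xml(bad))
```

Checks like these are cheap to run on every converted file, which is exactly why catching defects at this stage is so much less expensive than after ingestion.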

The second stage is where testing happens at a more black-box level: the runtime electronic content created by the processing engine once the verified XML file has been ingested. Core elements to verify here include content appearance, clarity, completeness of information (no truncations), the index, the glossary, navigational links and whether they lead to the right content, print features and so on. Most importantly, since search is one of the major advantages of digitized content, a good amount of QA needs to be done to check the validity, accuracy and completeness of the search feature. Once such core functionality has been verified, the next thing to check is compatibility across desktops and mobile devices such as laptops, phones, e-readers and tablets. This is important since content access on mobile devices is increasing by the day. We recently wrote a blog on "Content QA – is this really needed?". Specific to content digitization, where existing content is being digitized, not much test effort needs to be expended on testing the content itself; the focus is on testing the digitization process.

Besides the functional and UI testing we just talked about, other core forms of testing when content is digitized include:
Security (especially in cases where Digital Rights Management has been enabled)
Performance (whether content is hosted or implemented behind a firewall)
Accessibility (especially when digitization has accommodated specific accessibility features, such as content read-aloud and device support, to enable access for the physically challenged)
Usability (to ensure the current flow of content is effective for the target audience)
Globalization (only if the content is being localized at the time of digitization)

Having looked at the scope of content digitization QA at a high level, let's take a peek at the scope for test automation. There is a lot of room for automation at the stage when the XMLs are being verified; simple unit tests could even be written in XMLUnit to verify the generated XML files. Automation at this stage is very effective since it is simple to write, easy to maintain, saves a ton of time for the tester and helps catch bugs early, which can be very expensive if missed at this stage. Given how cost effective yet beneficial it is, this is a lucrative area to automate in content digitization testing. At the front end, while some pieces are best verified manually, a few areas such as checking for broken navigational links, performance and security testing can be automated. Navigational link checking is one area that is very cumbersome to test manually yet yields good results with a minimal amount of automation effort: basic utilities can be written to check that links work and that they navigate to the correct locations. As in any other product testing effort, performance and security are areas that are best tested, and only scalable, through effective use of automation, in content digitization as well. Given the volumes handled when content is digitized, it also makes sense to create an automated regression suite if the content is expected to undergo changes in future revisions; if not, the ROI of creating and automating such a suite is not very high.
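A basic link-checking utility of the kind mentioned above can be sketched as follows. To keep the sketch self-contained it resolves each href against a set of known pages; a hosted product would instead issue HTTP requests (for example with urllib) and treat non-2xx responses as broken. The page content and page names are illustrative only.

```python
# Minimal sketch of a broken-link check for the runtime output:
# collect every href in an HTML page and flag targets that do not
# resolve. Resolution here is an offline lookup against known pages.
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Accumulates the href of every <a> tag seen while parsing."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def find_broken_links(html: str, known_pages: set) -> list:
    collector = LinkCollector()
    collector.feed(html)
    return [href for href in collector.links if href not in known_pages]

page = ('<p><a href="glossary.html">Glossary</a> '
        '<a href="missing.html">Index</a></p>')
print(find_broken_links(page, {"glossary.html", "toc.html"}))
# → ['missing.html']
```

A utility like this only confirms that a link resolves; verifying that it lands on the *correct* content still needs either manual spot checks or tests that compare the target page's title or anchors against expectations.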

Using the above testing process, trained test engineers who have handled several digitization QA projects, the right mix of manual and automated test efforts, and content domain experts wherever necessary, we at QA InfoTech have provided content digitization QA services for several leading content and publishing houses. Content digitization is still in its nascent stages; a lot of evolution is yet to come in the process and its associated technologies, and the scope for content digitization QA is huge. This is certainly a great area in which to build a niche, given not just the scope of work but also the specialized skills a company or an individual can develop, and we continue to invest in our people, processes and R&D efforts to further strengthen our edge.

Software development has become a very integrated activity over the last decade: instead of one large piece of code developed in-house, many small pieces flow and fall into place like pieces of a puzzle to shape the end-to-end product. Such integrations and collaborations have been necessitated by several market drivers: the need for faster time to market, leveraging domain expertise from specialists rather than building everything in house, the demand for feature-rich products, and technology advancements that make such integrations seamless and possible. For a long time now, cross-group collaborations within a company have been encouraged to promote resident experts in specific areas, ease of code maintenance, modular functionality and so on. For example, in my projects at large ISVs, I have almost always seen such integrations in areas such as login functionality, payment processing, and installation and setup. In more recent years such integrations have extended their bounds, with ISVs now willing to integrate external code bases with their own to compete and maintain an edge in the marketplace.

Besides the areas I've mentioned above, a few others that have always been excellent targets for external interfacing include analytics and reporting; large ISVs have depended quite heavily on companies such as Omniture and WebTrends to incorporate core reporting modules into their software. The world of application development is now opening several doors for such external integrations. Having looked at some of the striking advantages such integration brings, one cannot miss the inherent challenges in third party collaborations. Let's take a quick look at some of those, along with the associated mitigation strategies.

Challenges and some mitigation strategies:

Challenge: Quality of Code – This is a huge challenge since a third party team is giving you the source code. Very often, product teams face challenges in instilling coding standards within their own team. In such scenarios, how do you ensure code quality from an external team into which you have neither complete visibility nor control?

Mitigation Strategies:
Define clear protocols and coding standards to use and ensure they are consistent with what is being used in house
Define metrics for measuring code quality and associate them with Service Level Agreements (SLAs) to reward (for met and exceeded expectations) and penalize (for expectations not met)
Define and implement quality gates/checkpoints, before integrating third party code into your code base

Challenge: Getting the external team to understand your business requirements, time, cost and quality constraints and effectively customize their code to meet your needs

Mitigation Strategies:
Involve the external team sooner rather than later in your product development life cycle and engage them in all important business and product conversations, including any field trips and client visits
Encourage free flowing yet managed communication channels between your business team and external team. I say managed as unless this is monitored, the entire flow can soon become very haphazard
Ensure transparency of communication at all times and insist on the same from their end, so you understand their product timelines and any challenges they face
Get the third party team to spend adequate cycles on using your product hands on; unless this is done, they will only get a theoretical view of the product and will not be able to deliver a quality integration piece

Challenge: Defect Management – Disconnected defect management practices often result in huge gaps in defect-fix timelines, incomplete defect fixes, ineffective hand-offs between teams and so on, all of which delay project timelines and increase total project costs

Mitigation Strategies:
Establish clear defect management protocols at the start of the project including acceptable timelines, entities involved in the hand shake between the two teams, tools to use etc.
Define service level metrics around number of defects, type of defects, severity of defects, defect fix timelines, regressions etc. at various stages in the product lifecycle
Get the third party team to spend adequate cycles on using your product hands on; unless this is done, they will only get a theoretical view of the product and product development and defect fixes will not be comprehensive

Challenge: Maintenance of external code post the initial release is often a huge challenge

Mitigation Strategies:
Think about code sustenance upfront. If you want the external team to continue to own the code, understand and align their release cycles to yours, so their code is up-to-date in user features and functionality, use of technology, defect management etc.
If code is to be transferred to your team, ensure you have the right team identified for training and take over; have one person assigned from the external team for future questions and clarifications
Having talked about some of the core challenges and mitigation strategies, let's now look at some vital quality gates that need to be implemented by the product team, in addition to the quality checks done by the external team, before and after code integration.

Static analysis of code – This is important to ensure the code meets the standards defined by the product team and is safe and free of vulnerabilities before it can be integrated into the mainstream. Code especially needs to be scanned from a security standpoint at this static level, to ensure no malicious snippets are being integrated. Such static analysis can be done manually (though that is neither scalable nor foolproof) or with home-grown or commercial code scanners. Such scanners run through all possible execution paths of the code and report issues and errors (including critical ones), warnings, vulnerabilities and so on. This is a great checkpoint before code integration happens, and is also one of the right points at which to enforce service level metrics.
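As a toy illustration of the home-grown end of that spectrum, here is a sketch that flags lines of incoming source containing patterns a product team has banned. The banned list (eval, exec, os.system) is a hypothetical example, and commercial scanners perform far deeper, path-sensitive analysis than this line-by-line pattern match.

```python
# Minimal, hypothetical sketch of a home-grown static check run
# before third party code is integrated: flag lines that match
# patterns the product team has banned for security reasons.
import re

BANNED = [r"\beval\s*\(", r"\bexec\s*\(", r"\bos\.system\s*\("]

def scan_source(source: str) -> list:
    """Return (line number, line text) pairs that match a banned pattern."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for pattern in BANNED:
            if re.search(pattern, line):
                findings.append((lineno, line.strip()))
    return findings

third_party_code = """def run(cmd):
    import os
    os.system(cmd)  # shell out
"""
print(scan_source(third_party_code))
# → [(3, 'os.system(cmd)  # shell out')]
```

Even a shallow gate like this is useful as an SLA checkpoint: the findings count per delivery is an objective number both teams can agree to drive down.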

Dynamic Analysis of code: At the dynamic level when the code is run, the following checks become important at the black and white box levels:
API level checks – automated tests at the interface level and database level to ensure interfaces work fine, are secure, have the right schemas and are consistent with the core product architecture and technology
Performance testing – check configuration files, setup files etc. to ensure the right parameters are being used. If the integrated external code is not on par with the core code, it can pull down the performance of the entire product. This is also one of the right stages for the product team to enforce SLAs with the external team
Functional testing and UI testing at the black box level to ensure the E2E functionality works fine, the UI is consistent overall and that the integration points are not apparent to the end user. Solid integration testing herein needs to be undertaken to ensure a seamless end user experience. If the external team has done specific customizations to align with your product line, such customizations also need to be tested for
Security testing – This is one of the most important aspects to check, to ensure no vulnerabilities have been introduced, especially when third party code (over which the product team may not have complete control) is being integrated into your mainstream code
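The API-level checks above can be sketched as automated schema assertions on an interface's response. The field names, expected types and the stub response below are all hypothetical; in practice the payload would come from a call to the real interface, and the expectations from the agreed interface contract.

```python
# Minimal sketch of an API-level check: assert that an interface's
# response carries the expected fields with the expected types.
def check_response_schema(payload: dict, expected: dict) -> list:
    """Return a list of human-readable schema violations (empty if clean)."""
    errors = []
    for field, ftype in expected.items():
        if field not in payload:
            errors.append(f"missing field: {field}")
        elif not isinstance(payload[field], ftype):
            errors.append(f"wrong type for {field}")
    return errors

# Hypothetical contract for a content-lookup interface.
expected = {"id": int, "title": str, "pages": int}

# Stub response standing in for a real endpoint call; note the
# deliberate defect: "pages" arrives as a string.
stub_response = {"id": 42, "title": "Chapter One", "pages": "12"}
print(check_response_schema(stub_response, expected))
# → ['wrong type for pages']
```

Running contract checks like this on every build of the integrated code catches interface drift between the two teams long before it surfaces as an end-user defect.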

In a nutshell, third party code integrations are becoming the need of the day to enhance your core product's functionality. Do not be intimidated by the challenges they pose; be aware of them and attack them at the very early stages to fully benefit from such external code. For that, as you have seen above, quality assurance right from the early stages is a very important tool!

Join us on August 31st, 2011

at 10:00 AM PDT; 1:00 PM EDT

Presenters:

Anup Patnaik, Sr. Director, Engineering

Ramandeep Singh, Sr. Manager, Quality Engineering

Participation: By invite only

Framework-driven test automation development has been advocated by several companies and experts in the last decade. The productivity gains, efficiency in maintaining automation code and quality of test automation code have all been spoken about at length, but how many have actually implemented such frameworks, how they have implemented them and what benefits they have reaped are all very specific to each effort.

Come join us for an hour-long session and demo on our home-grown test automation framework, built on open source technologies, helping us provide:
– Cost effective and quality solutions to our clients
– A great opportunity for manual testers to equally partake in test automation, making it less expensive, fun and motivating for the entire test team

Implementing this framework assures you:
– Faster test cycles, improved test productivity and more time available for testers to take on other test tasks
– A more universal test automation effort that is understandable and maintainable by the entire test team, in an agile development cycle
– Enhanced test coverage by leveraging the virtualization that has been built in

About the Speakers

Anup Patnaik

As Sr. Director of Engineering, Anup is responsible for processes, tools and technologies across all of QA InfoTech’s projects. With deep insight into software engineering methodologies, and more than eight years of experience across industry and research, Anup works towards providing value based test solutions to all of QA InfoTech’s clients. He is also responsible for the test automation efforts and has been instrumental in developing key frameworks and tools that have propelled QA InfoTech into a league of a select few companies with the highest level of automated testing capabilities.

He is an active speaker and presenter at various conferences. He also works with universities and students to bridge the industry-academia gap and promote Software Testing.

Prior to joining QA InfoTech, Anup worked at i2 Technologies and DaimlerChrysler Research and Technology Center building products in Supply Chain, e-learning and automobile domains.

Ramandeep Singh

Ramandeep has been with QA InfoTech since its very early years in 2004. As a Sr. Quality Manager, he has worked on a varied set of projects and test efforts. His areas of expertise include test automation using open source solutions, testing application management technologies, localization functionality and web services APIs. Raman plays an active role in the company in enhancing the test automation solutions and evangelising them.

Besides being an active practitioner, Ramandeep is a hands-on software testing and test automation trainer. He has trained several engineers in test automation at companies the likes of Intuit and Headstrong.