Jun 30, 2013

Q. Can you write sample code showing different ways to validate an input String, returning true if the input is alphanumeric and false otherwise?

A. It can be done in a number of different ways.

Using regex.

Without using regex.

Firstly, using regex
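A minimal sketch of both approaches follows. The class and method names are illustrative, and it assumes that "alphanumeric" means ASCII letters and digits only, and that null or empty input should return false.

```java
import java.util.regex.Pattern;

// Illustrative class name; assumes "alphanumeric" means ASCII letters and digits.
final class AlphaNumericValidator {

    // Precompiled so the pattern is not rebuilt on every call.
    private static final Pattern ALPHA_NUMERIC = Pattern.compile("[A-Za-z0-9]+");

    // Regex approach: the whole input must consist of letters and digits.
    static boolean isAlphaNumericRegex(String input) {
        return input != null && ALPHA_NUMERIC.matcher(input).matches();
    }

    // Non-regex approach: inspect each character in turn.
    static boolean isAlphaNumericLoop(String input) {
        if (input == null || input.isEmpty()) {
            return false;
        }
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            boolean letter = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
            boolean digit = c >= '0' && c <= '9';
            if (!letter && !digit) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isAlphaNumericRegex("abc123"));  // letters and digits only
        System.out.println(isAlphaNumericLoop("abc 123"));  // contains a space
    }
}
```

If non-ASCII letters should also count, Character.isLetterOrDigit(c) could replace the range checks in the loop version.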

Step 1: Have the right dependency jars in the pom.xml file as shown below.

Step 2: Download Jaspersoft iReport 5.0.0, and execute the ireport.exe from the iReport folder (e.g. C:\ireport\5.0.0\iReport-5.0.0\bin). You need to design the template now using iReport 5.0.0 and generate the person-template.jrxml, which will be used in Java as the template to generate the report.

Step 3: Tell iReport where to find the classes by defining the classpath via Tools --> Options, and then selecting the "Classpath" tab.

Step 4: Create a new report via File --> New.

Provide the name and path where you want to generate the template jrxml file.

Click on "Next" and then "Finish" to get the designer screen where you can add labels and text fields for the report.

Step 5: Add the column header for the report by dragging and dropping the "Static Text" for the column headers.

Step 6: Before you can map the Text Field with bean data source values, you need to bring in the Person java class we created in Step 1. Click on the little data source icon as shown below to get the pop up that allows you to define the com.mycompany.app.jasper.Person bean. Select firstName, surname, and age, click on "Add selected fields", and then click on "OK".

Step 7: Now you can map these fields to the new "Text Field" that you drag and drop as shown below. This will be done in the "detail1" section.

Step 8: Now, you need to map each Text Field to the corresponding field of Person.java. You do this by right-clicking on the field and selecting "Edit expression".

Step 9: In the "expression editor", you can map the field. Remove the default $F{Field} and "double-click" on "firstName" to get $F{firstName} and click on "Apply" as shown below.

Step 10: Map all the fields as shown below, and click on the icon that compiles the design to generate the person-template.jrxml (text) and person-template.jasper (binary) files. If there are any compile errors, you need to fix them and re-compile. You only need the jrxml file to generate the report in Java.

Step 11: Before you write the Main.java class, you need to have the relevant Jasper jar files. Here is the pom.xml file with the relevant dependency jar files.

Jun 28, 2013

Advanced Apache Camel Parallel Processing Tutorial

In the previous advanced Apache Camel tutorial we looked at sequential multicasting. In this tutorial, I will discuss multitasking with parallel processing. To multitask, you need to define an executor service first. This can be done via the applicationContext.xml file as shown below.

As shown above, the file consumer uses the thread pool via the following attribute name and value: "scheduledExecutorService=#scheduledExecutorService".

You can send messages to a number of Camel Components to achieve parallel processing and load balancing with components such as

SEDA for in-JVM load balancing across a thread pool. The SEDA component provides asynchronous SEDA behavior, so that messages are exchanged on a BlockingQueue and consumers are invoked in a separate thread from the producer. Note that queues are only visible within a single CamelContext. If you want to communicate across CamelContext instances (for example, communicating between Web applications), see the VM component.

The VM component provides asynchronous SEDA behavior, but differs from the SEDA component in that VM supports communication across CamelContext instances, so you can use this mechanism to communicate across web applications.

JMS or ActiveMQ for distributed load balancing and parallel processing. The JMS component allows messages to be sent to (or consumed from) a JMS Queue or Topic. The implementation of the JMS Component uses Spring's JMS support for declarative transactions, using Spring's JmsTemplate for sending and a MessageListenerContainer for consuming.

The Splitter from the EIP (Enterprise Integration Patterns) allows you to split a message into a number of pieces and process them individually. The split component takes the attribute "executorServiceRef". This attribute refers to a custom thread pool to be used for parallel processing. Notice that if you set this option, parallel processing is automatically implied, and you do not have to enable that option as well.

Note: As soon as you send multiple messages to different threads or processes, you will end up with an unknown ordering across the entire message stream, as each thread processes messages concurrently. For many use cases the order of messages is not important. However, for some applications it is crucial, and you need to preserve the order.
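The split-then-process-in-parallel idea, and one way to keep the results in order, can be sketched in plain Java. This is not Camel's Splitter; the class name, the newline delimiter, and the upper-casing "processing" step are assumptions made for illustration. The pieces may finish in any order on the pool, but ExecutorService.invokeAll returns its futures in the same order as the submitted tasks, so the collected results keep the original sequence.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Plain-Java illustration only; Camel's Splitter is configured in the route instead.
final class ParallelSplitSketch {

    // Splits the message on newlines, processes each piece on a thread pool,
    // and returns the results in the original order of the pieces.
    static List<String> splitAndProcess(String message, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String piece : message.split("\n")) {
                tasks.add(() -> piece.trim().toUpperCase()); // stand-in for real processing
            }
            List<String> results = new ArrayList<>();
            // invokeAll waits for all tasks and returns futures in task order.
            for (Future<String> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while processing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a piece failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(splitAndProcess("john\npeter\nsam", 3));
    }
}
```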

Q. What is SEDA?

A. SEDA stands for Staged Event Driven Architecture. This architecture decomposes a complex, event-driven application into a set of stages connected by queues. The most fundamental aspect of the SEDA architecture is the programming model that supports stage-level backpressure and load management. A stage is analogous to an "event". To simplify the idea, think of SEDA as a series of events sending messages between them.
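The "stages connected by queues" idea can be sketched with a BlockingQueue and two threads. This is only an illustration of the concept, not how the Camel SEDA component is implemented; the class name, the queue capacity of 2, and the end-of-stream marker are assumptions. The small bounded queue gives a simple form of backpressure, because put() blocks the producing stage while the queue is full.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Concept sketch: a producing stage and a consuming stage joined by a bounded queue.
final class SedaSketch {

    // Hypothetical marker telling the consuming stage that no more events follow.
    private static final String END = "__END__";

    static List<String> run(List<String> events) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(2); // bounded for backpressure
        List<String> handled = new ArrayList<>();

        // Second stage: runs in its own thread, separate from the producer.
        Thread consumer = new Thread(() -> {
            try {
                for (String event = queue.take(); !event.equals(END); event = queue.take()) {
                    handled.add("handled:" + event);
                }
            } catch (InterruptedException interrupted) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        // First stage: hands events to the queue; put() blocks while the queue is full.
        try {
            for (String item : events) {
                queue.put(item);
            }
            queue.put(END);
            consumer.join(); // wait until the second stage has drained the queue
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted", e);
        }
        return handled;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("file1.csv", "file2.csv", "file3.csv")));
    }
}
```

With a single consumer thread the order of events is preserved; adding more consumer threads would trade that ordering for throughput.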

Q. What does the Apache Camel "event" component do?

A. The event component provides access to the Spring ApplicationEvent objects. This allows you to publish ApplicationEvent objects to a Spring ApplicationContext or to consume them. You can then use Enterprise Integration Patterns, such as Message Filter, to process them.

The key point to note here is the ability of the mock objects to verify if a particular method was invoked and, if yes, how many times it was invoked. This is demonstrated in the last two lines with the verify statement. This is one of the key differences between using a mock object versus a stub. This is a common Java interview question quizzing the candidate's understanding of the difference between a mock object and a stub.
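To make the mock versus stub distinction concrete without depending on any mocking library, here is a hand-rolled sketch. All the names (Notifier, FileArrivalHandler, and so on) are hypothetical and are not the classes from the original test. The stub only returns a canned answer, whereas the mock also records how it was called so that the test can verify the interaction, which is what a Mockito verify statement does for you.

```java
// Hand-rolled illustration; a mocking library generates the recording part for you.
final class MockVersusStubSketch {

    // Hypothetical collaborator of the code under test.
    interface Notifier {
        boolean send(String message);
    }

    // Stub: supplies a canned answer and nothing else.
    static final class StubNotifier implements Notifier {
        @Override
        public boolean send(String message) {
            return true;
        }
    }

    // Mock: supplies the canned answer and records every invocation.
    static final class MockNotifier implements Notifier {
        private int sendCalls;
        private String lastMessage;

        @Override
        public boolean send(String message) {
            sendCalls++;
            lastMessage = message;
            return true;
        }

        int timesSendCalled() {
            return sendCalls;
        }

        String lastMessage() {
            return lastMessage;
        }
    }

    // Hypothetical code under test: notifies only for csv files.
    static final class FileArrivalHandler {
        private final Notifier notifier;

        FileArrivalHandler(Notifier notifier) {
            this.notifier = notifier;
        }

        boolean onFile(String name) {
            return name.endsWith(".csv") && notifier.send("arrived: " + name);
        }
    }

    public static void main(String[] args) {
        MockNotifier mock = new MockNotifier();
        FileArrivalHandler handler = new FileArrivalHandler(mock);
        handler.onFile("people.csv");
        handler.onFile("notes.txt");
        // Interaction check, possible with the mock but not with the stub.
        System.out.println("send() calls: " + mock.timesSendCalled());
    }
}
```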

Jun 24, 2013

Excel spreadsheet to generate SQL

When you have some data in a tabular format (e.g. an Excel spreadsheet) and would like to insert it into a database table, you need to write an SQL insert query. Manually writing SQL queries for multiple records can be cumbersome. This is where an Excel spreadsheet comes in handy, as demonstrated below. A single SQL query can be copied down, and the formulas get copied with incrementing row numbers.

The Excel concatenation character & is used to achieve this. The $ means fixed. $A1 fixes the Excel column A: when you copy the formula, the row numbers will be incremented like 2, 3, 4, etc., but the column will remain fixed to A. In the example below:

$A$1 = first_name

$B$1 = surname

$C$1 = age

Note: Both column and row are prefixed with $, which means both are fixed.

The above Excel expression is easier to understand if broken down as shown below where the concatenation character & plays a major role in combining static text within quotes with dynamic formulas like $A$1.
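For readers who think more easily in code, the same "static text plus cell value" concatenation can be sketched in Java. This is an analogy to the spreadsheet formula, not the formula itself; the class name and the table name person are assumptions, while the column names first_name, surname, and age come from the header cells above.

```java
// Java analogy of the spreadsheet concatenation; names other than the columns are assumed.
final class InsertSqlBuilder {

    // Doubles single quotes so that a value like O'Brien stays a valid SQL literal.
    private static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    // Combines the static SQL text with the row values, like & does in Excel.
    static String insertFor(String firstName, String surname, int age) {
        return "insert into person (first_name, surname, age) values ("
                + quote(firstName) + ", " + quote(surname) + ", " + age + ");";
    }

    public static void main(String[] args) {
        System.out.println(insertFor("John", "Smith", 35));
        System.out.println(insertFor("Sam", "O'Brien", 42));
    }
}
```

Generated statements like these suit one-off data loads. Application code that takes user input should use a PreparedStatement with bind parameters instead of string concatenation.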

OLAP: Data comes from various OLTP data sources as shown in the above diagram.

OLTP: Transactional and normalized data is used for daily operational business activities.

OLAP: Historical, de-normalized and aggregated multidimensional data is used for analysis and decision making (i.e. for business intelligence).

OLTP: Data is inserted via short inserts and updates. The data is normally captured via user actions in web based applications.

OLAP: Periodic (i.e. scheduled) and long running (i.e. during off-peak) batch jobs refresh the data. This is also known as the ETL process, as shown in the diagram.

OLTP: The database design involves highly normalized tables.

OLAP: The database design involves de-normalized tables for speed. It also requires more indexes for the aggregated data.

OLTP: Regular backup of data is required to prevent any loss of data, monetary loss, and legal liability.

OLAP: Data can be reloaded from the OLTP systems if required. Hence, stringent backup is not required.

OLTP: Transactional data older than a certain period can be archived and purged based on the compliance requirements.

OLAP: The volume of this data will be higher as well, due to its requirement to maintain historical data.

OLTP: The typical users are operational staff.

OLAP: The typical users are management and executives who make business decisions.

OLTP: The space requirement is relatively small if the historical data is archived.

OLAP: The space requirement is larger due to the existence of aggregation structures and historical data. It also requires more indexes than OLTP.

There are a number of commercial and open-source OLAP (aka Business Intelligence) tools like:

Oracle Enterprise BI Server, Oracle Hyperion System

Microsoft BI & OLAP tools

IBM Cognos Series 10

SAS Enterprise BI Server

JasperSoft (open source)

The OLAP tools are well known for their drill-down and slice-and-dice functionality. Also they enable users to very quickly analyze data by nesting the information in tabular or graphical formats. They generally provide good performance due to their highly indexed file structures (i.e. cubes) or in-memory technology.

Q. What is an OLAP cube?

A. An OLAP cube connects to a data source to read and process the raw data, performing aggregations and calculations for its associated measures. Cubes are the core components of OLAP systems. They aggregate facts from every level in a dimension provided in a schema. For example, they could take data about products, units sold and sales value, then add them up by month, by store, by month and store, and by all other possible combinations. They’re called cubes because the end data structure resembles a cube.
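The "add them up by month, by store, by month and store" idea can be sketched with Java streams. This only illustrates the aggregation concept; a real OLAP engine precomputes and indexes these totals. The class names and the sample figures are made up for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

// Concept sketch: total one measure (units sold) along different dimensions.
final class CubeSketch {

    // A single fact row with two dimensions (month, store) and one measure (units).
    static final class Sale {
        private final String month;
        private final String store;
        private final int units;

        Sale(String month, String store, int units) {
            this.month = month;
            this.store = store;
            this.units = units;
        }

        String month() { return month; }
        String store() { return store; }
        int units() { return units; }
    }

    // Sums the units for each distinct value of the chosen dimension.
    static Map<String, Integer> unitsBy(List<Sale> sales, Function<Sale, String> dimension) {
        return sales.stream().collect(
                Collectors.groupingBy(dimension, TreeMap::new, Collectors.summingInt(Sale::units)));
    }

    public static void main(String[] args) {
        List<Sale> sales = List.of(
                new Sale("Jan", "A", 10),
                new Sale("Jan", "B", 5),
                new Sale("Feb", "A", 7));

        System.out.println(unitsBy(sales, Sale::month));                         // by month
        System.out.println(unitsBy(sales, Sale::store));                         // by store
        System.out.println(unitsBy(sales, s -> s.month() + "/" + s.store()));    // by month and store
    }
}
```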

Jun 21, 2013

JPA interview questions and answers and high level overview - part 2

Q. What is an EntityManager?

A. The entity manager javax.persistence.EntityManager provides the operations from and to the database, e.g. find objects, persist them, remove objects from the database, etc. Changes to entities that are managed by an EntityManager are automatically propagated to the database when the transaction is committed. These objects are known as persistent objects. If the EntityManager is closed (via close()), then the managed entities are in a detached state. These are known as detached objects. If you want to synchronize them again with the database, the EntityManager provides the merge() method. Once merged, the objects become persistent objects again.

The EntityManager is the API of the persistence context, and an EntityManager can be injected directly into a DAO without requiring a JPA Template. The Spring Container is capable of acting as a JPA container and of injecting the EntityManager by honoring the @PersistenceContext annotation (both as a field-level and a method-level annotation).

Q. What is an Entity?

A. A class that should be persisted in a database must be annotated with javax.persistence.Entity. Such a class is called an entity. Each instance of the class will be a row in the person table. So, the columns in the person table will be mapped to the Person Java object annotated with @Entity. Here is the sample Person class.

Now let's see how we can use the Notepad++ find/replace function with regular expressions to add quotes around each entry, as shown below.

As you can see

Find what regular expression is: ([^,]*)(,?) , which means 0 or more characters other than "," captured as the first group (stored in "\1"), followed by 0 or 1 "," (i.e. an optional comma) captured as the second group (\2).

Replace with regular expression is: "\1"\2 , where a " is added first, followed by \1 (the captured value, like BA555), then a closing ", and finally the optional ",".

Similar approaches can be used for formatting other bulk records like removing spaces, adding quotes, replacing new line characters with tabs, replacing tabs with new lines, replacing commas with new lines, etc.
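The same find/replace can be sketched in Java with String.replaceAll, where the groups are referenced as $1 and $2 instead of \1 and \2. One deliberate difference, flagged as an assumption of this sketch: it uses [^,]+ (one or more) rather than [^,]* (zero or more), because in Java the zero-or-more form also matches the empty string at the end of the input and would append an extra pair of quotes. The class name and the sample values other than BA555 are made up.

```java
// Java counterpart of the editor find/replace; uses + instead of * (see lead-in).
final class QuoteCsvEntries {

    // Wraps every comma-separated entry in double quotes and keeps the commas.
    static String quote(String line) {
        // Group 1: the entry itself; group 2: the optional comma that follows it.
        return line.replaceAll("([^,]+)(,?)", "\"$1\"$2");
    }

    public static void main(String[] args) {
        System.out.println(quote("BA555,BA556,BA557"));
    }
}
```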

Jun 19, 2013

JPA interview questions and answers and high level overview - part 1

Q. What is JPA? What are its key components?

A. The process of mapping Java objects to database tables and vice versa is called "Object-relational mapping" (ORM). The Java Persistence API provides Java developers with an object/relational mapping (ORM) facility for managing relational data in Java applications. JPA is a specification, and several implementations are available, like Hibernate, EclipseLink, OpenJPA, and TopLink. Via JPA the developer can map, store, update and retrieve data from relational databases to Java objects and vice versa.

Q. What is the difference between hibernate.cfg.xml and persistence.xml?

A. If you are using Hibernate's proprietary API, you'll need the hibernate.cfg.xml. If you are using JPA, i.e. Hibernate EntityManager, you'll need the persistence.xml. You will not need both, as you will be using either the Hibernate proprietary API or JPA. However, if you had used the Hibernate proprietary API using hibernate.cfg.xml with hbm.xml mapping files, and now want to start using JPA, you can reuse the existing configuration files by referencing the hibernate.cfg.xml in the persistence.xml via the hibernate.ejb.cfgfile property, and reuse the existing hbm.xml files. In the long run, migrate the hbm.xml files to JPA annotations.

Q. What is an EntityManagerFactory and a Persistence unit?

A. The EntityManager is created by the EntityManagerFactory, which is configured by the persistence unit. The persistence unit is described via the file "persistence.xml" in the directory META-INF in the source folder. It defines a set of entities which are logically connected, and the connection properties, as shown below via an example.

Usually, JPA defines a persistence unit through the META-INF/persistence.xml file. Starting with Spring 3.1, this XML file is no longer necessary – the LocalContainerEntityManagerFactoryBean now supports a ‘packagesToScan’ property where the packages to scan for @Entity classes can be specified. The snippet below shows how you can bootstrap with or without persistence.xml.

Jun 18, 2013

Advanced Apache Camel tutorial

This advanced tutorial extends part-2. The first route (consumer) polls for csv files, and once a file is found in the c:/temp/adv folder, a sequential route is created by placing the info on two in-memory queues via the "direct" component. Firstly, direct:email consumes the exchange data and generates an email with an empty body to notify the user that a file has arrived. The second consumer, direct:transfer, receives the exchange and transforms the document to another csv using the PersonMapper class. Depending on whether the conversion succeeded or not, the file is written to either the converted or the rejected sub folder. The key thing to note is how body and header information are passed back and forth between the XML based routing in Spring and the Java beans.

Jun 17, 2013

JUnit with Mockito tutorial

Unit testing is very important in the development life cycle, and you can expect questions regarding mock objects and unit testing in general. This post extends the blog post relating to writing a mapper class for Apache Camel. This post writes a JUnit test with Mockito for the PersonMapper class.

Jun 14, 2013

The power of regular expression and testing tools like regexpal.com

Regular expressions are also known as regex, and they are very powerful and used widely in JavaScript, Unix scripting, development tools, monitoring tools like Tivoli, and programming languages. So, it really pays to have good knowledge of regexes. This post is about a handy tool named "regexpal.com" that helps you not only learn regex, but also to quickly verify your regexes while working on a project. If you google for "regex online validator" you will find other similar online validating tools.

Step 1: Open up a browser and type regexpal.com in the address bar. You will get the online validator.

Step 2: Type or copy/paste the regex at the top window and the text to apply the regex on the second window.

As indicated in the footer, use RegexBuddy for more powerful regex testing. The regex keywords are highlighted in blue. The matched phrase is also highlighted: in blue for the exact match and in yellow for any other string match.

The above example matches any string followed by one of "Monitoring", "MYAPP", or "CASHFORECAST". The "|" in regex means "OR". What if you want to literally match "Monitoring|MYAPP|CASHFORECAST"? You can escape "|" with "\" as shown below.

Now, the "|" is not highlighted in blue as it is escaped.

Change the text as shown below. The "CASHFORECAST" is changed to "TRADEFORECAST", and a match is no longer found, as the text is not highlighted.
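The difference between "|" as the OR operator and an escaped, literal "|" can also be checked in Java. The exact regex from the screenshots is not reproduced here, so the two patterns below are assumptions built from the three names mentioned above. Note that inside a Java string literal the backslash itself has to be doubled.

```java
import java.util.regex.Pattern;

// Illustrative patterns built from the three names mentioned in the text.
final class AlternationSketch {

    // Unescaped | means OR: any one of the three words is enough.
    private static final Pattern ANY_OF =
            Pattern.compile("Monitoring|MYAPP|CASHFORECAST");

    // Escaped \| is a literal pipe: the whole piped phrase must appear.
    private static final Pattern LITERAL =
            Pattern.compile("Monitoring\\|MYAPP\\|CASHFORECAST");

    static boolean containsAnyOf(String text) {
        return ANY_OF.matcher(text).find();
    }

    static boolean containsLiteral(String text) {
        return LITERAL.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsAnyOf("job MYAPP finished"));
        System.out.println(containsLiteral("Monitoring|MYAPP|CASHFORECAST"));
        System.out.println(containsLiteral("Monitoring|MYAPP|TRADEFORECAST"));
    }
}
```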

Jun 12, 2013

Scenario based question -- Designing a report or feed generation in Java

Scenario based and open-ended questions like this can reveal a lot about your ability to design systems.

Q. If you have a requirement to generate a report or a feed file with millions of records pulled from the database, how will you go about designing it and what questions will you ask?

A. The questions to ask are:

How to display or provide the report. For example, online -- synchronously the user expects to see the report on the GUI or off-line -- asynchronously by sending the feed/report via an email or any other notification mechanisms like SFTP after generating the report in a separate thread.

Should we restrict the online reports for only last 12 months of data to minimize the report size and get better performance, and provide report/feed for data older than 12 months via offline processing.

Should we generate both online and offline reports asynchronously, and then for the online reports have the browser or GUI client poll for report completion to display the results on the GUI? Alternatively, the report can be emailed or downloaded via the web at a later time.

What report generation framework to use like Jasper Reports, Open CSV, XSL-FO with Apache FOP, etc depending on the required output formats.

What is the source of truth for the report data -- database, RESTful web service call, XML, etc?

How to handle exceptional scenarios -- send an error email, use a monitoring system like Tivoli or Nagios to raise production support tickets, etc?

Security requirements. Are we sending the feed/report with sensitive data via email? Do we need proper access control to restrict who can generate what for online reports?

Should we schedule the offline reports to run during off peak?

Archival and purging of the older reports. What is the report retention period for the requirements relating to auditing and compliance purpose? How big are the feed files and should they be gzipped?

The above scenario can be implemented in a number of different ways.

Firstly, using a simple custom solution.

In this solution, a blocking queue and Java multi-threading (i.e. the Executor framework) can be used to asynchronously produce a report. Alternatively, you can use asynchronous processing with Spring.
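A minimal sketch of the custom approach, under several assumptions: the class and method names are hypothetical, the "report" is just a string, and a real job would page through the database and write a file. The fixed thread pool queues submitted jobs internally on a blocking queue, the caller gets a job id back immediately, and a client can poll isDone(id) before fetching the result.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical service: submit a report job, poll for completion, then fetch it.
final class AsyncReportSketch {

    // newFixedThreadPool holds waiting jobs on an internal LinkedBlockingQueue.
    private final ExecutorService pool = Executors.newFixedThreadPool(2);
    private final Map<Integer, Future<String>> jobs = new ConcurrentHashMap<>();
    private final AtomicInteger ids = new AtomicInteger();

    // Returns immediately with a job id; the report is built on a pool thread.
    int submit(int recordCount) {
        int id = ids.incrementAndGet();
        jobs.put(id, pool.submit(() -> "report with " + recordCount + " records"));
        return id;
    }

    // Lets a GUI client poll without blocking.
    boolean isDone(int id) {
        Future<String> job = jobs.get(id);
        return job != null && job.isDone();
    }

    // Blocks until the report for the given id is ready.
    String await(int id) {
        try {
            return jobs.get(id).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("report generation failed", e.getCause());
        }
    }

    void shutdown() {
        pool.shutdown();
    }

    public static void main(String[] args) {
        AsyncReportSketch reports = new AsyncReportSketch();
        int id = reports.submit(1_000_000);
        System.out.println(reports.await(id));
        reports.shutdown();
    }
}
```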

Secondly, an Enterprise Integration Framework like Apache Camel can be used to create an asynchronous route. The high-level diagram shows a possible solution using Apache Camel. This framework is written to address the Enterprise Integration Patterns (i.e. EIP).

Apache Camel is awesome if you want to integrate several applications with different protocols and technologies. The Spring Integration framework is another alternative. There are a number of tutorials on Apache Camel in this blog to get started, as it is a very handy skill to have to solve business problems and convince your potential employers.

Finally, using an Enterprise Service Bus (ESB) like webMethods, Tibco, Oracle Service Bus, Mule, etc. Mule is an open source ESB. There are pros and cons to each approach. More on these topics can be found at

Jun 10, 2013

Testing RESTful web services with the cURL command line tool

Modern web application development is full of RESTful web services, and it is very handy to know a few tools to test them.

Q. What is the cURL command line tool?

A. RESTful web applications are widely used and developed in many languages including Java, and cURL is a command line tool to quickly test RESTful web service functionality. cURL is commonly available on Unix based operating systems. Here are some examples.

Note that this is a HEAD request, and -X is used to define the HTTP method or verb like HEAD, GET, POST, PUT, etc. The following example shows testing a health check URL to see if the RESTful web service is up. "-i" is used to show the response headers. "-H" is used to pass the request headers with the request.

Note: The resource URI needs to be quoted if you pass in multiple query parameters separated by ‘&’. If you have spaces in the query values, you should encode them, i.e. use either the ‘+’ symbol or %20 instead of the space. Also, for GET requests, the -X GET is optional.
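If the URI is assembled in Java before being pasted into a cURL command, the standard URLEncoder can take care of the spaces; it encodes a space as "+". The base URI and the parameter names firstName and surname below are hypothetical, and the Charset overload of URLEncoder.encode assumes Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Builds a query string with encoded values; URI and parameter names are made up.
final class QueryEncodingSketch {

    static String personSearchUri(String baseUri, String firstName, String surname) {
        // Only the values are encoded; the ? = and & separators stay as they are.
        return baseUri
                + "?firstName=" + URLEncoder.encode(firstName, StandardCharsets.UTF_8)
                + "&surname=" + URLEncoder.encode(surname, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(personSearchUri("http://localhost:8080/person", "Mary Ann", "Smith"));
    }
}
```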

Here is an example of a POST request, where "-d" is used for the data to be posted.

The PUT request is done very similar to a POST request as shown above, but with -X PUT. A POST request is used to create a new person record and a PUT request is made to edit an existing person record as shown below.

There are other good tools to test RESTful web services, like the Poster plugin from Firefox Add-ons. Another great tool that I have blogged about is the RESTClient stand-alone jar. These two are great GUI tools if you do not want to get down and dirty with cURL. If you are testing from Windows, you could install Cygwin and then install and use cURL.

The JMX based MBeans provided by Apache Camel are useful for debugging and negative testing where you need to stop one or more routes. JMX can be accessed via JConsole. We touched on using JConsole for

Step 2: Select "local Process" and the "StandAloneCamelWithSpring" as highlighted above, and then click on "Connect".

Step 3: You need to select the "MBeans" tab to see all the MBeans, and then one of the MBeans is the Apache Camel under which you have components, routes, consumers, context, endpoint, etc available at runtime.

Step 4: The operations to stop and start routes are performed via the "operations" option as highlighted below.

JConsole can be used to connect to other remote JVM processes by providing hostname, port, and login credentials.

By default, the JMX instrumentation agent is enabled in Camel. This means that the Camel runtime creates and registers MBean management objects with an MBeanServer instance in the VM. This allows Camel users to instantly obtain insights into how Camel routes perform, down to the individual processor's level.

The domain name of the MBean object can be configured by a Java VM system property:

-Dorg.apache.camel.jmx.mbeanObjectDomainName=your.domain.name

Or, by adding a jmxAgent element inside the camelContext element in Spring configuration: