Let's start from the beginning:
My job entails running disaster simulations. We stage a hypothetical disaster and see how the groups/companies participating in the exercise react.
The problem is that when the exercise is finished, the output is a single Word file that contains all the answers from all the companies.
If that sounds confusing, here's an example:
*Keep in mind that all the information is contained in multiple tables: one table per person per section.*
Exercise:

section 1
person a table
person b table
person c table

and so on.

What I need is one document per person, containing only that person's answers.

Right now I start from the master file and create the separate files for the different people by hand. That is a very long process, especially when more than 40 people are participating in the exercise.

I was thinking that I could write a program in either Java or Python that would separate the different people into different files. I have some slightly advanced knowledge of both from classes I've taken.

Spookster

01-04-2012, 07:25 PM

Is the application supposed to parse an existing Word document and generate the multiple separate docs, or is it going to have a GUI where you enter all the information and it then generates the multiple docs?

MagicMeese

01-04-2012, 08:04 PM

I think that my boss would appreciate parsing the existing document and generating the separate docs.
The problem with a GUI is that I'm not that well-versed in that subject.

Spookster

01-04-2012, 10:01 PM

There are many different approaches you can take. If I were doing this, I would definitely avoid having to parse a Word document and just develop a GUI where all the information is entered, then generate the documents you need from that. Depending on your needs, it might also be beneficial to store the data in a database in case you ever need to update or regenerate those documents later.
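
To make the database idea a bit more concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table layout and the function names (open_store, save_answer, answers_for) are made up for illustration, and it assumes one answer per person per section. Whatever GUI or form you end up with would call save_answer, and the document generator would call answers_for once per person.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the answer store. ':memory:' keeps it in RAM for testing."""
    conn = sqlite3.connect(path)
    # ASSUMPTION: one answer per person per section, so the pair is the key.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS answers ("
        " person TEXT NOT NULL,"
        " section TEXT NOT NULL,"
        " answer TEXT NOT NULL,"
        " PRIMARY KEY (person, section))"
    )
    return conn


def save_answer(conn, person, section, answer):
    """Store an answer; saving again for the same person/section overwrites it."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO answers (person, section, answer)"
            " VALUES (?, ?, ?)",
            (person, section, answer),
        )


def answers_for(conn, person):
    """Return [(section, answer), ...] for one person, ordered by section name."""
    rows = conn.execute(
        "SELECT section, answer FROM answers WHERE person = ? ORDER BY section",
        (person,),
    )
    return rows.fetchall()
```

Because the answers live in the database rather than in the generated files, regenerating one person's document after a correction is just another call to answers_for.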

If you go the other route, the Word document that you parse must strictly adhere to a template; otherwise any change in the formatting of that document could break your parser. I'm in that situation now with a parser we have written in Python that parses Excel spreadsheets. If you go with Python, there are 2 libraries to choose from for working with MS files. The one I use is called win32com, which gives you complete access to all the COM objects. The other one is called PyWin, which is not very well supported and is a custom interface to the COM objects, so you don't always have access to all of them, just what it supports. It's been many years since I've used Java, so I don't know what libraries might be available for working with MS files.
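
For the parsing route, a rough sketch with win32com (installed via the pywin32 package; needs Windows with Word installed) might look like the following. It is untested against a real exercise file and leans on one big assumption: that the first cell of every table holds the person's name. If the real template identifies people some other way, that one line has to change. The function names are made up for illustration.

```python
import os
from collections import defaultdict


def clean_cell(text):
    """Strip the end-of-cell markers Word appends to a cell's text."""
    return text.rstrip("\r\x07").strip()


def group_indices_by_person(names):
    """Map each person to the 1-based positions of their tables, in document order."""
    groups = defaultdict(list)
    for index, name in enumerate(names, start=1):
        groups[name].append(index)
    return dict(groups)


def split_master(master_path, out_dir):
    """Copy each person's tables from the master file into their own .docx."""
    # Imported here so the helpers above can be used without pywin32.
    import win32com.client

    word = win32com.client.Dispatch("Word.Application")
    word.Visible = False
    try:
        master = word.Documents.Open(os.path.abspath(master_path))
        # ASSUMPTION: cell (1, 1) of each table contains the person's name.
        names = [clean_cell(t.Cell(1, 1).Range.Text) for t in master.Tables]
        for person, indices in group_indices_by_person(names).items():
            out = word.Documents.Add()
            for i in indices:
                master.Tables(i).Range.Copy()
                end = out.Content
                end.Collapse(0)  # 0 = wdCollapseEnd: move to the end of the doc
                end.Paste()
                out.Content.InsertParagraphAfter()  # keep tables from merging
            # ASSUMPTION: the name is safe to use as a file name.
            out.SaveAs(os.path.join(os.path.abspath(out_dir), person + ".docx"))
            out.Close()
        master.Close(False)  # False = do not save changes to the master
    finally:
        word.Quit()
```

The grouping step is kept separate from the COM calls on purpose: it can be checked with plain lists on any machine, and it is the part most likely to change if the template changes.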