How to Extract Text & Images Easily from MS Office Files

We may come across the need to extract images or text from an MS Word or MS PowerPoint file. Done by hand, this usually means copying and pasting one page at a time, which takes a long time with very large files.

Well, we have a simple trick to help you extract images and text from files in the newer formats, i.e. DOCX, PPTX and XLSX. For files in the older formats, i.e. DOC, PPT and XLS, all you need is a free tool to extract the images quickly and easily.

Note: For the purposes of this demonstration, we will be using only an MS Word file. The process is the same for MS PowerPoint and MS Excel files.

How to Extract Images & Text from DOCX, PPTX, XLSX Files

Before following the steps, open the folder containing your files. Click Organize > Folder and Search Options > View and uncheck Hide extensions for known file types. Now you can see the file extension with each filename.

Locate and select the file you want to extract images and text from (note: it is better to make a copy of said file). In this example, our target file is named Sample File.docx.

Press F2 to rename the file and replace the extension name with .zip.

A warning will be shown to confirm the change of the file extension. Click Yes.

Right click on the ZIP file and click on Extract files.

Locate and open the folder containing the extracted data and then open the word folder.

In it you will see a few folders and XML files. In the media folder you will find the extracted images. For the extracted text, open the document.xml file with Notepad or XML Notepad.
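If you would rather not read the raw XML by eye, the text can also be pulled out with a short script. Below is a minimal Python sketch, not an official tool. It assumes the file is a DOCX whose body sits in word/document.xml, and the function name extract_docx_text is a hypothetical one made up for this example. It reads the file directly as a ZIP archive, so no renaming is needed.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def extract_docx_text(docx_path):
    """Return the document text, one line per paragraph (hypothetical helper)."""
    # A DOCX file is a ZIP archive, so zipfile can open it as-is
    with zipfile.ZipFile(docx_path) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for paragraph in root.iter(W_NS + "p"):
        # Visible text is stored in <w:t> elements inside each <w:p> paragraph
        pieces = [node.text or "" for node in paragraph.iter(W_NS + "t")]
        paragraphs.append("".join(pieces))
    return "\n".join(paragraphs)
```

Calling extract_docx_text("Sample File.docx") should return the text as a single string. This is a simplification: tables, text boxes, headers and footers get no special handling.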

Here’s what you will find in the media folder.
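The same result can be reached without renaming or unzipping anything by hand. The following is a rough Python sketch under a few assumptions: images sit in a media folder one level below the top of the archive (word/media for DOCX, ppt/media for PPTX, xl/media for XLSX), and the function name extract_media and the output folder are hypothetical names chosen for illustration.

```python
import os
import zipfile


def extract_media(office_path, output_dir):
    """Copy every file in the archive's media folder into output_dir.

    Hypothetical helper; returns the list of file paths it wrote.
    """
    os.makedirs(output_dir, exist_ok=True)
    written = []
    with zipfile.ZipFile(office_path) as archive:
        for name in archive.namelist():
            # Expect entries such as word/media/image1.png
            parts = name.split("/")
            if len(parts) == 3 and parts[1] == "media" and parts[2]:
                # Use only the bare filename so nothing lands outside output_dir
                target = os.path.join(output_dir, parts[2])
                with open(target, "wb") as handle:
                    handle.write(archive.read(name))
                written.append(target)
    return written
```

For example, extract_media("Sample File.docx", "Sample File images") would place the embedded images in a new folder next to the script. It works only for the newer ZIP-based formats, not for DOC, PPT or XLS files.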

How to Extract Images from a Single DOC, PPT or XLS File

If you want to extract images from MS Office files in the older formats, the above method won’t work for the images. You need a free tool called Office Image Extraction Wizard for this purpose. The tool works with MS Office files as far back as 2012, and it can process one or multiple MS Office files in one go.

Choose the document you want to extract images from (for this example, we’re using a file named Ch1.doc), and select the output folder. You can opt to have a folder created to house all your output images by ticking the option Create a folder here. Once you are done, click Next.

Click Start to begin the process.

Once the image extraction process is finished, click on Click here to open destination folder and it will open the output folder.

As you can see below, the program has created a Ch1 folder.

Inside the folder are the extracted images.

How to Extract Images from Multiple DOC, PPT or XLS Files

For extracting images from multiple files of the DOC, PPT or XLS formats, tick the Batch mode option found at the bottom left.

Click on Add Files and then select the files you want to extract images from. Hold down the Ctrl key to select multiple files in one go. After selecting the files, click Next.

Click Start.

When the process is completed, locate and open the output folder. Here, you will see two folders with the original filenames. Open these folders to see the extracted images from their original MS Office files.

How to Extract Images with “Save as Web Page” Method

There is another method that will work with both newer and older MS Office files.

Open the DOCX or XLSX file, click on File > Save As > Computer > Browse and save the file as a Web Page.

Locate the folder that carries the filename you gave the Web Page when saving it. In it, you will see all the images extracted from the file.