Extract images from a specific position?

Hello. I have a rectangular region defined by XY coordinates, and I need to extract all images inside that rectangle.
For example: I have a PDF file with 10 pages and I need to extract all images between x1,y1 and x2,y2 (by coordinates). Can I achieve this with GemBox.Document?
In addition: how can I find all images in section N (for example, section 3)?

Hi John,

This task is more appropriate for GemBox.Pdf.

Please check the first two examples on the following page:

The examples demonstrate how to export images into files and how to retrieve the image locations.
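
Once the image locations are known, picking the images inside x1,y1 and x2,y2 comes down to a rectangle containment check. Below is a minimal, library-agnostic sketch in Python; the `Rect` type, the `images_in_region` helper and the sample data are hypothetical and only illustrate the filtering step. With GemBox.Pdf the bounds would come from the image elements shown in the linked examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in page coordinates (hypothetical helper type)."""
    left: float
    bottom: float
    right: float
    top: float

    def contains(self, other: "Rect") -> bool:
        # True only when `other` lies completely inside this rectangle.
        return (self.left <= other.left and self.bottom <= other.bottom
                and other.right <= self.right and other.top <= self.top)


def images_in_region(images, region):
    """Return the names of images whose bounds are fully inside `region`.

    `images` is an iterable of (name, Rect) pairs.
    """
    return [name for name, bounds in images if region.contains(bounds)]


# Made-up sample data: three images on one page.
sample_images = [
    ("logo", Rect(50, 700, 150, 760)),
    ("chart", Rect(100, 300, 400, 500)),
    ("banner", Rect(0, 780, 600, 840)),
]
region = Rect(40, 250, 450, 770)
selected = images_in_region(sample_images, region)
```

If partially overlapping images should also count, the check would need to test for intersection instead of full containment.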

I hope this helps.


Thanks. It works with PDF. But could you describe how to do the same with DOCX?

If I have a DOCX file, how can I find out what section 3 contains: images, text, tables…


Unfortunately, that information is not available for DOCX files.

You see, Word documents themselves do not have a page concept. They are flow documents, and the page concept is specific to the Word application that renders them.

Here is a content model of the document:

As you can see, you can retrieve the Section elements and their child elements.
But those elements don’t have information about the page and the location they are on.
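
To illustrate that section membership (unlike page position) is stored in the file itself, here is a rough Python sketch that reads a DOCX package directly with the standard library. It is not GemBox.Document code, and `images_by_section` is a hypothetical helper. It assumes the usual WordprocessingML layout: a section ends at a paragraph whose properties contain `w:sectPr`, the last section ends at the `w:sectPr` directly under `w:body`, and embedded pictures are `a:blip` elements with an `r:embed` relationship id. Linked pictures, legacy VML images, and images in headers or footers are not covered.

```python
import zipfile
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
A = "http://schemas.openxmlformats.org/drawingml/2006/main"
R = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
PKG = "http://schemas.openxmlformats.org/package/2006/relationships"


def images_by_section(docx):
    """Return one list per section, holding the image part targets it embeds.

    `docx` is a path or a file-like object containing a DOCX package.
    """
    with zipfile.ZipFile(docx) as package:
        root = ET.fromstring(package.read("word/document.xml"))
        rels = ET.fromstring(package.read("word/_rels/document.xml.rels"))
    body = root.find(f"{{{W}}}body")
    # Map relationship ids (rId1, ...) to part targets (media/image1.png, ...).
    targets = {rel.get("Id"): rel.get("Target")
               for rel in rels.iter(f"{{{PKG}}}Relationship")}

    sections, current = [], []
    for block in body:
        # Collect every embedded picture inside this paragraph or table.
        for blip in block.iter(f"{{{A}}}blip"):
            rel_id = blip.get(f"{{{R}}}embed")
            if rel_id in targets:
                current.append(targets[rel_id])
        # A sectPr in the paragraph properties, or directly in the body,
        # closes the current section.
        ends_section = (block.tag == f"{{{W}}}sectPr"
                        or block.find(f"{{{W}}}pPr/{{{W}}}sectPr") is not None)
        if ends_section:
            sections.append(current)
            current = []
    if current:  # content after the last section marker, if any
        sections.append(current)
    return sections
```

With this sketch, `images_by_section("input.docx")[2]` would list the images of section 3 (the file name is a placeholder). Note that it says nothing about which page or coordinates those images end up on.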

The flow document types (DOC, DOCX, RTF, HTML, etc.) define content in a flowable manner, and any application that opens such a document must paginate and render the content itself.

There can be differences in rendering between applications. For instance, just as the same website can look different in different browsers, the same Word file can look different when opened in different Word applications.

So what I’m trying to say is that in different applications those images may end up in different locations, even though it is the same Word file.
That’s why extracting images by coordinates is not advisable for Word documents. Instead, you should consider another way of identifying them (like surrounding the picture with a bookmark, or adding metadata to the picture, etc.).
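
As a rough illustration of the bookmark idea, the Python sketch below scans the main document XML of a DOCX file and collects the pictures that sit between a named bookmark's start and end markers. It uses only the standard library, not GemBox.Document; `images_in_bookmark` and the bookmark name in the usage note are hypothetical. It assumes the standard `w:bookmarkStart` / `w:bookmarkEnd` markers, paired by `w:id`, and embedded pictures stored as `a:blip` elements.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
A = "http://schemas.openxmlformats.org/drawingml/2006/main"
R = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"


def images_in_bookmark(document_xml, bookmark_name):
    """Return relationship ids of pictures located inside the named bookmark.

    `document_xml` is the content of word/document.xml (str or bytes).
    """
    root = ET.fromstring(document_xml)
    active_id = None  # w:id of the bookmark once its start marker is seen
    found = []
    # iter() walks the elements in document order.
    for element in root.iter():
        if (element.tag == f"{{{W}}}bookmarkStart"
                and element.get(f"{{{W}}}name") == bookmark_name):
            active_id = element.get(f"{{{W}}}id")
        elif active_id is not None:
            if (element.tag == f"{{{W}}}bookmarkEnd"
                    and element.get(f"{{{W}}}id") == active_id):
                break  # matching end marker reached
            if element.tag == f"{{{A}}}blip":
                rel_id = element.get(f"{{{R}}}embed")
                if rel_id is not None:
                    found.append(rel_id)
    return found
```

For example, `images_in_bookmark(xml, "Figures")` would return the ids of pictures wrapped in a bookmark named "Figures"; the ids can then be resolved to image files through word/_rels/document.xml.rels. This identifies the pictures by their place in the content, independent of how any application paginates the document.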