Read Sections in Word file

Hi.
I’ll describe my issue:

  1. I want to load a DOCX document.
  2. Read all the sections inside (for example, 5 sections).
  3. Go to section #2 and do something inside it.
  4. I need to work with section #2 only (not 1, 3, 4, 5).

For example:

foreach (Section s in document.Sections)
{
    if (s == document.Sections[1])
    {
        // do something: read or write
    }
}

Could you please describe how to do that (the syntax)?

Hi,

Try this:

// Sections are indexed from zero, so index 1 is section #2.
var section = document.Sections[1];
// ... do something with the Section element ...

Regards,
Mario

Could you describe how to read all the text inside section #2? Please provide the full code.

Hi,

The easiest way to retrieve the text from a Section is like this:

string text = section.Content.ToString();

You can also iterate through the section’s child elements, like this:

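// Requires: using System.Text;
var builder = new StringBuilder();
// Passing true walks all descendants of the section, not just its direct children.
foreach (var element in section.GetChildElements(true))
{
    // Start a new line whenever a paragraph is encountered.
    if (element is Paragraph)
        builder.AppendLine();
    // Append the text of each run.
    else if (element is Run run)
        builder.Append(run.Text);
}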
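
Since you asked for full code, here is a minimal sketch of a complete program that puts the steps above together. The "input.docx" path and the ReadSectionText helper are hypothetical names used only for illustration, and the free-limited license key is assumed to be enough for a small test document.

```csharp
using System;
using GemBox.Document;

static class SectionReader
{
    // Hypothetical helper: returns the text of the section at the given
    // zero-based index, or null when the document has no such section.
    public static string ReadSectionText(DocumentModel document, int sectionIndex)
    {
        if (sectionIndex < 0 || sectionIndex >= document.Sections.Count)
            return null;

        return document.Sections[sectionIndex].Content.ToString();
    }
}

class Program
{
    static void Main()
    {
        // Assumption: free-limited mode is sufficient for this document.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        // "input.docx" is a placeholder path.
        var document = DocumentModel.Load("input.docx");

        // Index 1 is section #2 because sections are indexed from zero.
        string text = SectionReader.ReadSectionText(document, 1);
        Console.WriteLine(text ?? "The document has fewer than two sections.");
    }
}
```

The index check is there only so that a document with a single section does not cause an out-of-range exception.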

Lastly, please browse through our online examples:
https://www.gemboxsoftware.com/document/examples
They are the fastest way to get familiar with the GemBox.Document API.

Regards,
Mario

Thank you Mario for your support.
May I ask you an additional question:

I have a DOCX file with 5 images inside, and I need to find an image by the caption under it.

I need to find the text “Img4” and save the image above that caption. Only that one image.

Thanks

Hi Dekan,

Please send us your DOCX file so that we can investigate its content.

Regards,
Mario

I’ve prepared a link to the DOCX file for you (shared via WeTransfer).
Thanks

Try something like this:

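// Requires: using System.Collections.Generic; using System.IO; using GemBox.Document;
var document = DocumentModel.Load("sections.docx");

// Map each caption text to the picture in the paragraph directly above it.
var picturesWithCaption = new Dictionary<string, Picture>();
foreach (Picture picture in document.GetChildElements(true, ElementType.Picture))
{
    var paragraphWithPicture = picture.Parent;

    var parent = paragraphWithPicture.ParentCollection;
    int index = parent.IndexOf(paragraphWithPicture);

    // Skip a picture whose paragraph is the last one, since no caption can follow it.
    if (index + 1 >= parent.Count)
        continue;

    var paragraphWithCaption = parent[index + 1];
    // Normalize the caption: drop periods and surrounding whitespace.
    string caption = paragraphWithCaption.Content
        .ToString()
        .Replace(".", string.Empty)
        .Trim();

    // The indexer overwrites a repeated caption, whereas Add would throw.
    picturesWithCaption[caption] = picture;
}

// TryGetValue avoids a KeyNotFoundException when no caption matches.
if (picturesWithCaption.TryGetValue("Img4", out var myPicture))
    File.WriteAllBytes($"output.{myPicture.Format}", myPicture.PictureStream.ToArray());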

This will work for all of those images except “Img2”.
That is because the “Img2” caption is not in the next paragraph, but in the same paragraph as the picture.

In other words, the “Img2” text sits side by side with the picture.

Perhaps an easier way for you to notice this difference is to change the size of that image.
You’ll see that the “Img2” text is no longer below the image; instead, it ends up to the right of the image.
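
If you also need to cover the “Img2” case, one possible approach is to check the picture's own paragraph for text first and fall back to the next paragraph only when it has none. The following is only a sketch: the CaptionFinder class and its method names are hypothetical, and it assumes that each picture is a direct child of a paragraph and that a caption in the same paragraph is stored in ordinary runs.

```csharp
using System.IO;
using System.Linq;
using GemBox.Document;

static class CaptionFinder
{
    // Drop periods and surrounding whitespace from a caption.
    public static string NormalizeCaption(string text) =>
        text.Replace(".", string.Empty).Trim();

    // Hypothetical helper: prefers text in the picture's own paragraph,
    // then falls back to the paragraph that follows it.
    public static string FindCaption(Picture picture)
    {
        var paragraph = picture.Parent;

        // Concatenate the text of all runs that share the paragraph with the picture.
        string sameParagraphText = string.Concat(
            paragraph.GetChildElements(true, ElementType.Run)
                .Cast<Run>()
                .Select(run => run.Text));

        string caption = NormalizeCaption(sameParagraphText);
        if (caption.Length > 0)
            return caption;

        var siblings = paragraph.ParentCollection;
        int index = siblings.IndexOf(paragraph);

        // No following element means there is no caption to read.
        if (index + 1 >= siblings.Count)
            return string.Empty;

        return NormalizeCaption(siblings[index + 1].Content.ToString());
    }

    // Saves the first picture whose caption matches; returns false when none does.
    public static bool SaveByCaption(DocumentModel document, string caption, string outputBaseName)
    {
        var picture = document.GetChildElements(true, ElementType.Picture)
            .Cast<Picture>()
            .FirstOrDefault(p => FindCaption(p) == caption);
        if (picture == null)
            return false;

        File.WriteAllBytes($"{outputBaseName}.{picture.Format}", picture.PictureStream.ToArray());
        return true;
    }
}
```

With a loaded document, calling CaptionFinder.SaveByCaption(document, "Img2", "output") would then save that picture if a matching caption is found.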

I hope this helps.

Regards,
Mario