The easiest way to retrieve the text from a Section is like this:
string text = section.Content.ToString();
You can also iterate through the section’s child elements, like this:
var builder = new StringBuilder();
foreach (var element in section.GetChildElements(true))
{
    // Start a new line at each paragraph, then append the text of its runs.
    if (element is Paragraph)
        builder.AppendLine();
    else if (element is Run run)
        builder.Append(run.Text);
}
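To look up a picture by its caption, you can map each caption to its picture, like this:
using System.Collections.Generic;
using System.IO;
using GemBox.Document;

var document = DocumentModel.Load("sections.docx");
var picturesWithCaption = new Dictionary<string, Picture>();
foreach (Picture picture in document.GetChildElements(true, ElementType.Picture))
{
    var paragraphWithPicture = picture.Parent;
    var parent = paragraphWithPicture.ParentCollection;
    int index = parent.IndexOf(paragraphWithPicture);
    // Skip pictures that have no following paragraph to read a caption from.
    if (index + 1 >= parent.Count)
        continue;
    // The caption is assumed to be the paragraph right after the picture's paragraph.
    var paragraphWithCaption = parent[index + 1];
    string caption = paragraphWithCaption.Content
        .ToString()
        .Replace(".", string.Empty)
        .Trim();
    // The indexer overwrites an entry instead of throwing on a duplicate caption.
    picturesWithCaption[caption] = picture;
}
var myPicture = picturesWithCaption["Img4"];
File.WriteAllBytes($"output.{myPicture.Format}", myPicture.PictureStream.ToArray());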
This works for every image except “Img2”.
That is because the “Img2” caption is not in the next paragraph, but in the same paragraph as the picture.
In other words, the “Img2” text sits side by side with the picture.
An easy way to see the difference is to change the size of that image.
The “Img2” text will no longer appear below the image; instead, it ends up to the right of it.
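One possible way to cover both cases is to check the picture's own paragraph for text first and only then fall back to the next paragraph. The following is a minimal sketch, not a verified fix: `CaptionHelper`, `GetCaption` and `Clean` are hypothetical names that are not part of GemBox.Document, and it assumes that a same-paragraph caption is stored in Run elements and that a captioned picture's paragraph contains no other text.

```csharp
using System.Linq;
using GemBox.Document;

static class CaptionHelper
{
    // Hypothetical helper: returns the caption for a picture, or an empty string if none is found.
    public static string GetCaption(Picture picture)
    {
        var paragraphWithPicture = picture.Parent;

        // First look for text in the same paragraph as the picture (the "Img2" case).
        // Assumption: such a caption is stored in Run elements.
        string caption = Clean(string.Concat(
            paragraphWithPicture.GetChildElements(true, ElementType.Run)
                .Cast<Run>()
                .Select(run => run.Text)));
        if (caption.Length > 0)
            return caption;

        // Otherwise fall back to the paragraph that follows, if there is one.
        var parent = paragraphWithPicture.ParentCollection;
        int index = parent.IndexOf(paragraphWithPicture);
        return index + 1 < parent.Count
            ? Clean(parent[index + 1].Content.ToString())
            : string.Empty;
    }

    // Same normalization as in the loop above: drop periods and surrounding whitespace.
    private static string Clean(string text) => text.Replace(".", string.Empty).Trim();
}
```

Inside the loop you could then write `string caption = CaptionHelper.GetCaption(picture);` and skip the picture when the result is empty.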