Is it possible to get the bounds of each word (or, even better, each character) in a PdfTextContent? I see there is a GetGlyphOffsets, but that only gives me the x position of each letter. Also, is it possible to get a SpaceCharWidth value for that PdfTextContent? (The .format.[text].WordSpacing seems to be 0.)
Note that this is a hidden (unlisted) version. To install it, you’ll need to run the above command on the NuGet Package Manager Console (Tools → NuGet Package Manager → Package Manager Console).
Then try the following example, which draws rectangles around the bounds of each PdfTextContent element and around the bounds of each individual glyph:
using GemBox.Pdf;
using GemBox.Pdf.Content;

static void Main()
{
    using var document = PdfDocument.Load("input.pdf");
    var page = document.Pages[0];
    var elements = page.Content.Elements;

    // New group that will hold the drawn rectangles; it is appended as the last element.
    var boundsGroup = elements.AddGroup();
    // Wrap the existing content (everything before boundsGroup) in its own group.
    elements.Group(elements.First, elements.Last.Previous);

    // Use page.Transform if not drawing bounds on the page but calculating the bounds on a potentially transformed (rotated) page.
    // var transform = page.Transform;
    var transform = PdfMatrix.Identity;

    using var enumerator = elements.All(transform, flattenForms: true).GetEnumerator();
    while (enumerator.MoveNext())
    {
        if (enumerator.Current.ElementType != PdfContentElementType.Text)
            continue;

        var textElement = (PdfTextContent)enumerator.Current;

        // Glyph bounds: apply the text element's own transform combined with the enumerator's transform.
        transform = textElement.Transform * enumerator.Transform;
        foreach (var glyph in textElement.Text)
        {
            var glyphBounds = glyph.Bounds;
            transform.Transform(ref glyphBounds);
            DrawBounds(boundsGroup, PdfColors.Green, glyphBounds);
        }

        // Element bounds: apply only the enumerator's transform.
        transform = enumerator.Transform;
        var elementBounds = textElement.Bounds;
        transform.Transform(ref elementBounds);
        DrawBounds(boundsGroup, PdfColors.Red, elementBounds);
    }

    document.Save("output.pdf");
}

// Strokes the outline of the given quad with a 0.5-wide line in the given color.
static void DrawBounds(PdfContentGroup group, PdfColor color, PdfQuad bounds)
{
    var pathElement = group.Elements.AddPath();
    pathElement.BeginSubpath(bounds.Point0, isClosed: true)
        .LineTo(bounds.Point1)
        .LineTo(bounds.Point2)
        .LineTo(bounds.Point3);

    var strokeFormat = pathElement.Format.Stroke;
    strokeFormat.IsApplied = true;
    strokeFormat.Width = 0.5;
    strokeFormat.Color = color;
}
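The example above yields per-glyph and per-element bounds, not per-word bounds. One way to get word bounds is to merge the glyph boxes between whitespace characters yourself. The following is only a rough sketch of that idea and does not use the library: GlyphBox, WordBox and WordBounds.Group are hypothetical names made up for illustration, and it assumes you have already converted each transformed glyph quad to an axis-aligned box (for rotated text, for instance, by taking the minimum and maximum of the four quad points) and that you know which character each glyph represents.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for one glyph: its character and its already
// transformed, axis-aligned box (not a library type).
public readonly record struct GlyphBox(char Character, double Left, double Bottom, double Right, double Top);

// Hypothetical result type: the word's text and the union of its glyph boxes.
public readonly record struct WordBox(string Text, double Left, double Bottom, double Right, double Top);

public static class WordBounds
{
    // Splits the glyph sequence at whitespace and returns one bounding box per word.
    public static List<WordBox> Group(IEnumerable<GlyphBox> glyphs)
    {
        var words = new List<WordBox>();
        var text = new StringBuilder();
        double left = 0, bottom = 0, right = 0, top = 0;

        // Emits the word collected so far, if any, and resets the text buffer.
        void Flush()
        {
            if (text.Length == 0)
                return;
            words.Add(new WordBox(text.ToString(), left, bottom, right, top));
            text.Clear();
        }

        foreach (var glyph in glyphs)
        {
            if (char.IsWhiteSpace(glyph.Character))
            {
                // Whitespace ends the current word and is not part of any box.
                Flush();
                continue;
            }

            if (text.Length == 0)
            {
                // First glyph of a new word: start from its box.
                (left, bottom, right, top) = (glyph.Left, glyph.Bottom, glyph.Right, glyph.Top);
            }
            else
            {
                // Later glyphs: grow the box so it also covers this glyph.
                left = Math.Min(left, glyph.Left);
                bottom = Math.Min(bottom, glyph.Bottom);
                right = Math.Max(right, glyph.Right);
                top = Math.Max(top, glyph.Top);
            }

            text.Append(glyph.Character);
        }

        Flush();
        return words;
    }
}
```

This only splits words inside the glyph sequence you pass in. If a PDF places a single word across several text elements, or separates words by positioning instead of space characters, you would need an extra step that compares the gaps between neighbouring boxes.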
Does that mean this feature will be released in the next general update, or do I need to avoid updating the DLL so as not to overwrite this hot-fix version?