When modifying a document the dashes '-' do not copy

I have a PDF document that I am converting from a 3-up postcard layout to one postcard per page. You guys helped me do that previously. However, we recently noticed that the dashes used in phone numbers, negative balances, etc. are not being preserved. When I step through the code I can see the dash being read, but when it is written to the new page, no dash is displayed.

I am working on a cleansed copy of the document, with sensitive data removed, so I can send it to you to look at. My concern is that when I make changes to this document it breaks (meaning it no longer processes through my app correctly); however, that may not matter for this issue.

Hi James,

The problem occurs because those glyphs are mapped to the soft hyphen Unicode character (U+00AD : SOFT HYPHEN [SHY] {discretionary hyphen}), which is basically a conditional hyphen.

To look at the issue from a different angle, try opening your document in Adobe, copying that text, and pasting it somewhere, such as Notepad. You’ll see that those dashes are missing.

That is expected behavior for soft hyphens: they act as placeholders that become “real” hyphens only when the word containing them needs to be broken.

So, unfortunately, the software that created this PDF mapped that glyph to the wrong Unicode character. Probably the easiest way to resolve this is to replace those characters with another hyphen character, such as the regular ‘-’ character (U+002D : HYPHEN-MINUS {hyphen, dash; minus sign}).

using (var formattedText = new PdfFormattedText())
{
    string text = ...
    formattedText.Append(text.Replace('\u00AD', '-'));
    // ...
}
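
If you want to confirm that the text you read really contains soft hyphens before replacing them, a small stand-alone check along these lines may help. This is only a sketch in plain .NET with no GemBox calls; the class and method names (SoftHyphenCheck, CountSoftHyphens, ReplaceSoftHyphens) are hypothetical and the phone number in the usage note is made up.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of any library.
public static class SoftHyphenCheck
{
    // U+00AD SOFT HYPHEN, the code point those glyphs were mapped to.
    private const char SoftHyphen = '\u00AD';

    // Counts how many soft hyphens the extracted text contains.
    public static int CountSoftHyphens(string text) =>
        text.Count(c => c == SoftHyphen);

    // Returns the text with every soft hyphen replaced by U+002D HYPHEN-MINUS.
    public static string ReplaceSoftHyphens(string text) =>
        text.Replace(SoftHyphen, '-');
}
```

For example, calling SoftHyphenCheck.CountSoftHyphens on "555\u00AD1234" reports one soft hyphen, and ReplaceSoftHyphens turns that string into "555-1234". If the count is zero for your text, the glyphs are probably mapped to some other character and the replacement above would need to target that one instead.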

Regards,
Mario

Perfect. That worked.