Hello,
we are using a MemoryStream to load a PdfDocument, clone a single page into a new PdfDocument and then convert this newer PdfDocument into an image (jpg) using the Save-method.
While the original PDF is fully readable, the resulting image has black “bars” in some locations:
As an example of said black “bar”.
The Pdf-to-image-conversion is successfull 9 out of 10 times.
For the conversion the following Code is used:
using (MemoryStream fileStream = new MemoryStream(Encoding.Default.GetBytes(pdfText)))
{
using (PdfDocument pdfFile = PdfDocument.Load(fileStream))
{
ImageSaveOptions imgOptions = new ImageSaveOptions(ImageSaveFormat.Jpeg)
{
DpiX = 150,
DpiY = 150
};
for (int i = 0; i < pdfFile.Pages.Count; i++)
{
try
{
PdfPage page = pdfFile.Pages[i];
using (PdfDocument document = new PdfDocument())
{
document.Pages.AddClone(page);
string imgPath = Path.ChangeExtension(pdfPath, i + ".jpg");
document.Save(imgPath, imgOptions);
imagesPath.Add(imgPath);
}
}
catch (Exception ex)
{
MessageBox.Show(ex.Message);
throw;
}
}
}
}
I have tried saving as a png-file → same result as with jpg-files.
I have played with the dpi Values, changed them between 150, 300 and 500 → no noticable difference.
I used the PDFPrintContent-Options to ensure that those black things are neither annotations, nor FormFields.
I even typed a quick Python-Program (using PyMuPDF) to convert the PDF to an Image to check if the PDF is inherently problematic, but with the Python-Program, the resulting Image was identical to the PDF.
Do you guys have an idea or a pointer where these bars stem from?
And ideally how to prevent them?
Best Regards
Andreas