How to add an Index table (Table of Contents) to a PDF

Hi,

I have one existing PDF file with the Index at the start.
The Index is in a table format. It has a Ref number, Name, and Page number.

Below is an example of what my Index looks like:

[Screenshot: Capture]

I want to add a hyperlink to the name of each entry in the Index that jumps to the respective page.
Basically, I want to find that particular word (which may contain spaces) in the PDF file and add a hyperlink to it.

Is this possible with GemBox.Pdf?

Hi Yogesh,

GemBox.Pdf currently doesn’t provide an API for hyperlinks.
Nevertheless, perhaps you could add those links using the low-level API.

However, the problem is with searching for the “particular word”.
That text could be split in different ways across several PdfTextContent elements, so a single word is not guaranteed to sit in one element.

For instance, try running the “Reading additional information about a text” example with your PDF.
It should show you all the text elements and their positions.

Anyway, to further investigate your exact requirement, can you also send us your PDF file?

Regards,
Mario

Hi

Thanks for the quick response.
I tried the example you pointed me to, but it gives a list of characters instead of words.

Here is a sample PDF file that matches my requirement.

Hi Mario,

Can you suggest the best way to create a PDF with an Index table whose entries link internally within the PDF?

Our actual requirement is this: let’s say we have three separate PDF documents. We merge these three documents into a single PDF document and add an Index table at the beginning. Each entry in this Index table should link to the page on which the respective document starts, as explained by @tgyogesh.

We have 200+ documents to be merged and indexed in this way.

Can you guide us on the best way to generate an index with links within the same PDF?

Your response will help us decide whether GemBox is the best fit for our application.

Regards
Amit Patel

Hi Yogesh,

Yes, I’m afraid that the software that was used for generating this PDF has written the text in such a way that each letter is in its own PdfTextContent element.
Because of this, searching and replacing a “particular word” is problematic.
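
To illustrate why this is problematic, here is a minimal sketch of how per-character elements could be stitched back into lines of text before searching. The Fragment type, the FragmentJoiner class, and the gapTolerance value are hypothetical and not part of GemBox.Pdf; the sketch assumes each text element's text and position have already been read, for example from the output of the example mentioned earlier. Grouping by rounded baseline is a simplification and will not handle rotated or skewed text.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical stand-in for one extracted text element: its text plus the
// left edge (X), baseline (Y), and width of its bounds, all in points.
public record Fragment(string Text, double X, double Y, double Width);

public static class FragmentJoiner
{
    // Rebuilds lines of text from fragments. Fragments that share a rounded
    // baseline form one line and are ordered left to right. A space is
    // inserted when the gap between neighbours exceeds gapTolerance.
    public static List<string> JoinLines(IEnumerable<Fragment> fragments, double gapTolerance = 1.5)
    {
        var lines = new List<string>();

        // PDF Y coordinates grow upward, so the topmost line has the largest Y.
        foreach (var line in fragments.GroupBy(f => Math.Round(f.Y)).OrderByDescending(g => g.Key))
        {
            var text = new StringBuilder();
            double? previousRight = null;

            foreach (var fragment in line.OrderBy(f => f.X))
            {
                if (previousRight.HasValue && fragment.X - previousRight.Value > gapTolerance)
                    text.Append(' ');

                text.Append(fragment.Text);
                previousRight = fragment.X + fragment.Width;
            }

            lines.Add(text.ToString());
        }

        return lines;
    }
}
```

Once the lines are rebuilt, a name that contains spaces can be searched with an ordinary string search, but mapping the match back to the original elements (to place a link over them) still has to be done by hand.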

Nevertheless, the requirement that Amit mentioned (creating an Index table) is easier to achieve.
I’ll create an example project on Monday that will accomplish this by using both GemBox.Document and GemBox.Pdf.

Regards,
Mario

Hi Amit,

Please try this MergeWithToc.zip sample project.

In short, the project creates a new document with a TOC element using GemBox.Document.

The TOC element is generated by adding “Heading 1” paragraphs with text that you specify for each PDF file (e.g., “Document Template” text for “MyDocumentTemplate.pdf” file).
The page numbers for the TOC entries are generated by temporarily adding empty placeholder pages, which stand in for the pages of your actual PDF files.
After updating the TOC element, the TOC document is saved to PDF.

Then, with GemBox.Pdf, the empty placeholder pages are replaced with the actual PDF pages, and the destination pages of the TOC links are replaced as well.
After that, the resulting document is saved as a “Result.pdf” file.
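
The placeholder pages exist only so that each TOC entry gets the right page number. The arithmetic behind that can be sketched as follows. This is not code from the sample project: the TocPlanner class and PlanStartPages method are hypothetical names, and the sketch assumes the TOC fills the first pages of the result and the merged documents follow in order.

```csharp
using System;
using System.Collections.Generic;

public static class TocPlanner
{
    // Returns the 1-based page on which each source document starts in the
    // merged file, assuming the TOC occupies the first tocPageCount pages
    // and the documents follow one after another.
    public static List<(string Title, int StartPage)> PlanStartPages(
        int tocPageCount, IReadOnlyList<(string Title, int PageCount)> documents)
    {
        var plan = new List<(string Title, int StartPage)>();
        int nextPage = tocPageCount + 1;

        foreach (var (title, pageCount) in documents)
        {
            plan.Add((title, nextPage));

            // One placeholder page is reserved per real page of this document.
            nextPage += pageCount;
        }

        return plan;
    }
}
```

For example, with a one-page TOC and three documents of 3, 3, and 4 pages, the start pages come out as 2, 5, and 8.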

Regards,
Mario

Hi Mario,

Thanks for the time you invested in creating the TOC sample project.

The TOC element used in the sample project has a fixed appearance, like this:

Chapter1…2
Chapter2…5
Chapter3…8

But as I said in my initial question, our requirement is to have the index in a table format, as shown in the screenshot. And I guess this TOC element will not allow any custom modifications.

Hi Yogesh,

Try adding this ConvertTocToTable method after calling toc.Update():

static void ConvertTocToTable(TableOfEntries toc)
{
    var document = toc.Document;

    var table = new Table(document);
    table.TableFormat.AutomaticallyResizeToFitContents = false;
    table.TableFormat.PreferredWidth = new TableWidth(100, TableWidthUnit.Percentage);

    table.Rows.Add(new TableRow(document,
        new TableCell(document, new Paragraph(document, "Sr No.")) { CellFormat = { PreferredWidth = new TableWidth(15, TableWidthUnit.Percentage) } },
        new TableCell(document, new Paragraph(document, "Name")) { CellFormat = { PreferredWidth = new TableWidth(70, TableWidthUnit.Percentage) } },
        new TableCell(document, new Paragraph(document, "Page")) { CellFormat = { PreferredWidth = new TableWidth(15, TableWidthUnit.Percentage) } }));

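    // Add one table row per TOC entry.
    for (int i = 0; i < toc.Entries.Count; i++)
    {
        // Each entry is a paragraph whose first inline is a hyperlink;
        // its display inlines are: [0] text, [1] tab, [2] "PageRef" field.
        var entry = toc.Entries[i] as Paragraph;
        var hyperlink = entry.Inlines[0] as Hyperlink;

        // Copy the page number field for the "Page" column.
        var page = hyperlink.DisplayInlines[2].Clone(true) as Field;

        // Copy the hyperlink for the "Name" column, keeping only its text.
        var name = hyperlink.Clone(true);
        name.DisplayInlines.RemoveAt(2); // Remove "PageRef" field.
        name.DisplayInlines.RemoveAt(1); // Remove tab.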

        table.Rows.Add(new TableRow(document,
            new TableCell(document, new Paragraph(document, $"{i + 1}")),
            new TableCell(document, new Paragraph(document, name)),
            new TableCell(document, new Paragraph(document, page))));
    }

    // Replace the TOC element's content with the table.
    var content = toc.Content.Set(table.Content);

    // Insert a centered "Index" title before the table.
    var index = new Paragraph(document, "Index");
    index.ParagraphFormat.Alignment = HorizontalAlignment.Center;
    ((Run)index.Inlines[0]).CharacterFormat.Size = 20;

    content.Start.InsertRange(index.Content);
}

Does this solve your issue?

Regards,
Mario

That did the trick. Thank you for your time Mario.

Regards,
Yogesh

Hi Mario,

We are going ahead with the solution you provided, and it has been working well for us so far.
But we are facing one issue while cloning a page of an editable PDF file.

var newPage = pdfDocument.Pages.AddClone(page);

The line above throws an object reference error when we try to merge the editable PDF file.
I am attaching the sample editable file that I am using.

Here is the sample file

Any help on this will be appreciated.

Regards,
Yogesh

Hi,

Please try again with this bugfix:
https://www.gemboxsoftware.com/pdf/nightlybuilds/GBA15v1103.zip

Or this package:
Install-Package GemBox.Pdf -Version 15.0.1103-hotfix

Does this solve your issue?

Regards,
Mario