Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

A quick tutorial for getting Alteryx data written to an Adobe Acrobat / PDF form

PeterA
Alteryx Alumni (Retired)

 

Filling out PDF forms is often a necessary step in the analytic processes.  Here's is a sample use-case and practical guide that will help you get your analytics data into your PDF forms.  While there currently is no out-of-the-box / pre-built tool for PDF forms in 2019.2, the steps below will show how easy it is to create your own PDF form filling macro with a single python tool.

 

USE CASE / BACKGROUND:

Our PDF form filling scenario will be a workforce analytics automation scenario that is all about eliminating those mundane and time-consuming processes associated with the creation of employee vesting statements. Here we will showcase how the Alteryx Analytic platform will help streamline and control the equity decisions within an organization's guidelines while delivering transparency and auditability.

 

TYPICAL BUSINESS ENVIRONMENT & OPERATIONAL CHALLENGES:
 - Management directive to simplify and streamline the development and allocation of equity awards
 - Deliver an enhanced view of equity programs with improved levels of transparency, control and auditability across the compensation planning process.
 - Reduce the dependency on legacy mail merge approaches that have proved too time-consuming and lacked the requisite level of control.

 

 

The PDF Form - Employee Vesting Template:

In this scenario our HR / employee vesting template was created in Microsoft Word & Adobe Acrobat DC and adheres to the 1.7 PDF standard.

PDF Form / Template for Vesting StatementsPDF Form / Template for Vesting Statements

 

 

Preparing the Form:

An important step is making sure that the Alteryx Field/Column name *IS EQUIVALENT*  to the form field name in the Adobe PDF template.

Other notable items:

  * Field order does not matter

  * Visibility Flag is normally set at True for these types of forms.

  * Read-only Flag is Optional.  Our python code will set this flag in the PDF and will effectively eliminate the 'blue boxes'

PDF Form Field PropertiesPDF Form Field Properties

 

 

Preparing the Workflow:

As mentioned above, Alteryx Field/Column name must match the form field name in the Adobe PDF template.  In the below/attached workflow we will alter the data stream to match our form, by creating a new field call fullName.  Here are a couple of additional changes we made along the way:

  * Added an Employee Password (dynamic)

  * Added an HR (owner) Password (static)

  * Specified the fullpath to our PDF template form

  * Specified directory of our output files

  * Cleaned up the look (2 digit decimals)

  * Converted all numeric fields to a string

Main Workflow - Workforce AnalyticsMain Workflow - Workforce Analytics

 

 

 

THE PYTHON BASED BATCH MACRO: 

This is a somewhat universal macro.  It will take in any data stream and attempt to insert those values into the correct PDF fields.  Then it will flatten and encrypt before it moves onto the next record (employee).

Python based batch macro for filling PDF formsPython based batch macro for filling PDF forms

END RESULT:

Here is a copy of the end result:

 

PDF Form filled by AlteryxPDF Form filled by Alteryx

 

 

For those that want to dive into the python code, just expand the spoiler below.

Spoiler
#################################
__author__ = 'Peter Anderson'
__copyright__ = 'Copyright 2019, Alteryx, Inc.'
__credits__ = ['Peter Anderson']
__version__ = '2019.7.1'
__maintainer__ = 'Peter Anderson'
__email__ = 'peter.anderson@alteryx.com'
__status__ = 'Dev'


from ayx import Alteryx, Package
# Installs the Python packages that are not included in the default installation.  
#Package.installPackages(['PyPDF2']) 

from PyPDF2 import PdfFileWriter, PdfFileReader
from PyPDF2.generic import BooleanObject, NameObject, IndirectObject, NumberObject


#################################
''' NEED APPEARANCES FLAG
Setting the NeedAppearances flag to true specifies whether to construct 
appearance streams and appearance dictionaries for all widget annotations
in the document. Default value: false.  This function will change that value  to
to "True."

See 12.7.2 and 7.7.2 for more information:
<a href="<a href="<a href="http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf" target="_blank">http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf</a>" target="_blank"><a href="http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf</a" target="_blank">http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf</a</a>>" target="_blank"><a href="<a href="http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf</a" target="_blank">http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf</a</a>" target="_blank"><a href="http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf</a</a" target="_blank">http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf</a</a</a>>>
'''
def set_need_appearances_writer(writer):
    try:
        catalog = writer._root_object
        # get the AcroForm tree and add "/NeedAppearances attribute
        if "/AcroForm" not in catalog:
            writer._root_object.update({
                NameObject("/AcroForm"): IndirectObject(len(writer._objects), 0, writer)})

        need_appearances = NameObject("/NeedAppearances")
        writer._root_object["/AcroForm"][need_appearances] = BooleanObject(True)

    except Exception as e:
        print('set_need_appearances_writer() catch : ', repr(e))
        return writer
    
    
''' ANNOTATION FLAG
Setting to flag to "7" will not allow the annotations to interact with the
user.  The annotation may be displayed or printed (depending on the settings
of the NoView and Print flags) but should not respond to mouse clicks
or change its appearance in response to mouse motions. 

See 12.5.3 for more information:
<a href="<a href="<a href="http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf" target="_blank">http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf</a>" target="_blank"><a href="http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf</a" target="_blank">http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf</a</a>>" target="_blank"><a href="<a href="http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf</a" target="_blank">http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf</a</a>" target="_blank"><a href="http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf</a</a" target="_blank">http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf</a</a</a>>>
'''
def set_annotation_flag_writer(writer, field_dictionary):
    try:
        for j in range(0, len(writer.getPage(0)['/Annots'])):
            writer_annot = writer.getPage(0)['/Annots'][j].getObject()
            for field in field_dictionary:
                if writer_annot.get('/T') == field:
                    writer_annot.update({
                        NameObject("/Ff"): NumberObject(7),
                    })
    except Exception as e:
        print('set_annotation_flag_writer() catch : ', repr(e))
        return


''' FILLS THE PDF FORM
This functions performs 3 action to filling out a PDF form.
* Forces Adobe's NeedAppearances Flag to True (make values visible)
* Forces Adobe's Annotation Flag to "7" which makes the form read-only.
* Datastream values are placed as values in the PDF Form - Incoming 
  column names must match the ACROFORM field names.
'''    
def AYX_PDF_form_fill(template, outfile, ownerPWD, userPWD):
    try:
                
        field_dictionary = df.to_dict('records')[0]
        inputStream = open(template, "rb")
        pdf_reader = PdfFileReader(inputStream, strict=False)

        pdf_writer = PdfFileWriter()
        set_need_appearances_writer(pdf_writer)

        pdf_writer.addPage(pdf_reader.getPage(0))
        pdf_writer.updatePageFormFieldValues(pdf_writer.getPage(0), field_dictionary)

        set_annotation_flag_writer(pdf_writer, field_dictionary)
        
        outputStream = open(outfile, "wb")
        pdf_writer.encrypt(userPWD, ownerPWD, use_128bit=True)
        pdf_writer.write(outputStream)

        inputStream.close()
        outputStream.close()
        
    except Exception as e:
        print('AYX_PDF_form_fill() catch : ', repr(e))
        return
    
    




#################################
df = Alteryx.read("#1")        #Read in Datastream from Alteryx Designer Canvas

template = df.loc[0,'vestingStatementTemplate']
outfile = df.loc[0,'FileName']
ownerPWD = df.loc[0,'ownerPWD']
userPWD = df.loc[0,'userPWD']

AYX_PDF_form_fill(template, outfile, ownerPWD, userPWD)



#################################

 

36 REPLIES 36
apathetichell
19 - Altair

@R_JonesHere's the python package this workflow needs:

 

Package.installPackages(['PyPDF2'])

kelvin_law1
9 - Comet

Hi @PeterA,

 

This is a great post and worked perfect using your sample PDF!!

However, when I tried using my PDF form, the filled in text can only be shown after mouse click on that field, and it will be invisible again if the focus is gone (except checkboxes, which works perfect using your additional code in your reply above).

 

I have read many posts saying that NeedAppearances set to True should resolve the issue, but not in my case.

Do you know what else I could try to fix this?

 

Thanks in advance!!!

 

 

DimitriDross
5 - Atom

Hey, I have the same problem. Were you ever able to resolve this issue?

kelvin_law1
9 - Comet

Hi @DimitriDross 

 

Yes, I got that resolved by using another Python package, "pdfrw", instead of "PyPDF2".

You can refer to this page for more detail on how to use this package.

 

Hope it helps!!

Anna92
6 - Meteoroid

Hi there.

 

I ran the workflow and it ran successfully, but did not generate any PDF outputs. Am I supposed to replace the browse tool with an output tool to do this or should the python code be functioning to do that? Thank you! 

sowmyamayappa
5 - Atom

Hi @peter 
Thanks for the tutorial 
The workflow runs without any errors but it not producing any pdf files. However, just running the macro with template produces the single file. Could you please suggest 

KimLamNg
8 - Asteroid

Were anyone able to get the output pdf files?

calijohnson
5 - Atom

Hi Peter,

 

First of all thank you so much for this! It has been super helpful so far, but I was wondering if you could possibly explain how to adjust the coding for a different pdf than the template you were using. I am not super familiar with Python and have not been able to figure out how to use this macro set up to update my own pdf.

 

Thanks!

taylormgruendel
8 - Asteroid

Hi,

 

Thanks for sharing this macro! I was able to run this to fill out 1 PDF, but when it goes to the second record to fill out the second PDF it saves the PDF but I'm not able to open the file. I'm assuming the file is either damaged or corrupted. Do you know how I could go about fixing this? Thank you!

hcook
6 - Meteoroid

Thanks, Peter! This is great!

 

I am able to run the macro file (vesting_statement_helper_macro.yxmc), but the output looks blank in Adobe. However, if I view the PDF in Microsoft Edge, everything is there. Does anyone know how to get this to appear in Adobe?

Labels
Top Solution Authors