I frequently need to combine several PDF files into one large PDF so that I don’t have to email a mess of small files. I have done this before without much trouble using the following Ghostscript command, but I decided that my usual method was inefficient:
gs -dNOPAUSE -sDEVICE=pdfwrite -sOUTPUTFILE=combinedpdf.pdf -dBATCH 1.pdf 2.pdf 3.pdf
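Since the flags are the hard part to remember, one option is to wrap the command in a small shell function. This is only a minimal sketch: the function name pdfmerge and the DRY_RUN variable are my own inventions rather than anything Ghostscript provides, and it assumes gs is on the PATH.

```shell
# Hypothetical helper: merge PDFs with Ghostscript using the flags shown above.
# Usage: pdfmerge OUTPUT.pdf INPUT1.pdf INPUT2.pdf ...
# Set DRY_RUN=1 to print the gs command instead of running it.
pdfmerge() {
    if [ "$#" -lt 2 ]; then
        echo "usage: pdfmerge OUTPUT.pdf INPUT.pdf..." >&2
        return 1
    fi
    out=$1
    shift
    # ${DRY_RUN:+echo} expands to "echo" only when DRY_RUN is set and non-empty;
    # otherwise it expands to nothing and gs runs directly.
    ${DRY_RUN:+echo} gs -dNOPAUSE -sDEVICE=pdfwrite -sOUTPUTFILE="$out" -dBATCH "$@"
}
```

With that in a shell startup file, "pdfmerge combinedpdf.pdf 1.pdf 2.pdf 3.pdf" would run the same command as above, and prefixing it with DRY_RUN=1 shows what would be executed without touching any files.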
Later on I found a terminal-based application called PDFtk that allowed for a more easily remembered command:
pdftk PART1.pdf PART2.pdf PART3.pdf cat output COMBINED.pdf
where one simply replaces the capitalised portions with the appropriate names of the input and output PDF files. However, to use that utility within Gentoo, one has to compile sys-devel/gcc with the gcj USE flag enabled. That USE flag builds GCC with support for the Java Programming Language. While this was not necessarily a big dependency, I didn’t feel like recompiling GCC with support for Java in order to use a terminal utility. Instead, I wanted to use a lightweight GTK GUI application that would allow me to do some basic PDF tasks, and I did so with PDFshuffler.
This application is incredibly minimalistic, easy to use, and it accomplishes a few PDF tasks very nicely. Lifted directly from the project’s SF page, “PDF-Shuffler is a small python-gtk application, which helps the user to merge or split pdf documents and rotate, crop and rearrange their pages using an interactive and intuitive graphical interface. It is a frontend for python-pyPdf.” I have not used it to rotate any documents, but I have found that it allows me to take care of the other tasks quite effectively and efficiently. Even better, it really doesn’t have many dependencies not already on my system.
I have filed a stabilisation request (STABLEREQ) for this application as well as its two explicitly-listed dependencies (python-poppler and pyPDF) in the Gentoo bugzilla. If you use this application or either of its dependencies, please comment on your experiences, especially regarding runtime stability.
I hope that some of you find this application to be as helpful as I have. 🙂
Take care for now,
Zach
13 comments
There is also pspdftool on sourceforge. It should have minimal dependencies as well.
@Michael,
I have not, but I would be interested in seeing it in action. I am trying to find a list of the dependencies to see whether or not it will pull in many components of GNOME. I like my lightweight Openbox setup. 😉
Thank you for the recommendation!
Have you tried pdfmod (http://live.gnome.org/PdfMod)? It is available in suka overlay.
@Andre,
PDFjam is another good choice, and I believe it uses LaTeX. Thanks for mentioning it.
@Toralf,
I mentioned a strange error in the bug report, but I didn’t have those ones in particular. You may want to add them to the report:
http://bugs.gentoo.org/show_bug.cgi?id=295393
@jkt,
Yup, pdfjoin (as part of the pdfjam package) will do the trick as well. Thanks for bringing it to my attention.
@Karl,
PDFsam is also written in Java, and that doesn’t work for my particular needs, but thank you for mentioning it here so that others may readily find it.
@luke123,
Thanks for the shorter, more efficient command using pdftk. That will work nicely if the PDF files are all numbered accordingly.
pdftk PART[1-3].pdf cat output COMBINED.pdf
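One caveat about numbering, as a rough sketch: the shell expands a glob such as PART*.pdf in lexical order, so PART10.pdf would be listed before PART2.pdf. Assuming GNU sort (the -V option is an extension) and file names without whitespace, a helper along these lines should put the files in numeric order. The function name pdf_parts_sorted is hypothetical.

```shell
# Hypothetical helper: print the given file names in natural (version) order,
# one per line, so that PART2.pdf sorts before PART10.pdf.
# Assumes GNU sort for -V and names that contain no whitespace.
pdf_parts_sorted() {
    printf '%s\n' "$@" | sort -V
}
```

It could then be combined with the command above as "pdftk $(pdf_parts_sorted PART*.pdf) cat output COMBINED.pdf".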
Have a look at pdfsam. It works great.
What about pdfjoin from app-text/pdfjam?
Nice tool, works fine here too on x86, although some more warnings were shown:
tfoerste@n22 /mnt/E/my/kochen $ pdfshuffler
/usr/lib/python2.6/site-packages/pyPdf/pdf.py:52: DeprecationWarning: the sets module is deprecated
from sets import ImmutableSet
/usr/lib/python2.6/site-packages/pyPdf/generic.py:406: DeprecationWarning: object.__init__() takes no parameters
str.__init__(self, data)
/usr/lib/python2.6/site-packages/pyPdf/generic.py:216: DeprecationWarning: object.__init__() takes no parameters
int.__init__(self, value)
(u'exporting to:', '/mnt/E/my/kochen/sdf.pdf')
/usr/lib/python2.6/site-packages/pyPdf/pdf.py:163: DeprecationWarning: the md5 module is deprecated; use hashlib instead
import struct, md5
There’s also pdfjam for the task,
and I am very happy with it.
Excellent! While PDFtk didn’t work for my needs, I hope it meets yours. Take care.
–Zach
Thanks for the report. I will have a look at pdftk.
No problem at all. I hope you find the application useful! 🙂
–Zach
Thanks for the suggestion.
I found the gcj dependency for PDFtk cumbersome too.
I will definitely check it out.
Kamil.