Fix Copy-and-Pasting in PDFs
Wednesday, September 1, 2010 at 3:48PM
One Hour Programmer

Update (3/30/15): PDF-XChange Viewer is now supported.

Update (10/24/14): I’ve disabled comments due to spam. If you want to contact me, click here.

Update (5/2/11): Major update. See details below.

Update (11/21/2010): Improved version of the script. See details below.

Recently I’ve been reading a lot of ebooks in PDF format, so it wasn’t long before I noticed a problem: when you copy text in Adobe Reader (or any other PDF reader, like Foxit Reader), it copies the hard returns too. If you copy more than one line of text, each line is treated as its own paragraph.

 

The Problem

This is what it looks like in Adobe Reader.

This is what happens when you copy that text and paste it into Microsoft Word.

Still good, right? But what if I want to change the font size?

Here I changed the font size of the first two lines to 8-pt. Now you should see the problem quite clearly; in its infinite wisdom, Adobe Reader inserts hard returns instead of soft returns. In other words, Adobe Reader is essentially pressing “Enter/Return” after every line instead of automatically “word wrapping” the text to fit the window size. If this sounds absurdly idiotic, that’s because it is. At first I didn’t know that was the problem until I found this guy’s blog post in a Google search. To give you an idea of how basic “word wrap” is, Notepad has it.

I know I’m not the only one who’s had this problem, and someone even wrote a Word macro. I haven’t tried the macro, but it only works with Microsoft Word and it seems overly complicated: it uses well over 100 lines of code and requires a detailed step-by-step installation and usage guide. My script just works out of the box and only uses nine lines of code (it was five before I added support for other PDF readers). [The latest version uses quite a few lines of code, but the vast majority of it is just to add some “nice-to-have” features, especially the automatic quote appending.]

The Fix

Download

PDF Copy-Paster.exe (program)

Program Notes

This script strips all hard returns out of any copied text. Simply keep the program running and it will automatically take out all the hard returns in the background. The program only activates for PDF readers. Currently the program recognizes Adobe Reader (both the standalone program and the browser plugin versions), Foxit Reader, and Sumatra PDF as PDF readers. If your favorite PDF reader isn’t on here, shoot me a quick email and I’ll add it for you.
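To make the core idea concrete, here is a minimal Python sketch of the text clean-up step. It is not the program’s actual source (that is an AutoHotkey script); the function name strip_hard_returns is made up for illustration, and replacing each line break with a single space, so that words on adjacent lines don’t run together, is my assumption about the desired result. Watching the clipboard and checking which PDF reader is active are left out.

```python
def strip_hard_returns(text: str) -> str:
    """Join the lines of copied PDF text into one flowing paragraph.

    Hypothetical helper for illustration only; not the article's AutoHotkey code.
    """
    # Normalize Windows (\r\n) and bare \r line endings to \n first.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Trim stray whitespace around each line and drop empty lines.
    lines = [line.strip() for line in text.split("\n")]
    # Assumption: one space between former lines keeps words separated.
    return " ".join(line for line in lines if line)


if __name__ == "__main__":
    copied = "Recently I've been\r\nreading a lot\r\nof ebooks."
    print(strip_hard_returns(copied))
```

Note that this sketch does not rejoin words that the PDF hyphenated across a line break; it only removes the hard returns.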

Note that if you press Ctrl+C twice quickly, Adobe Reader may pop up a message saying “There was an error while copying to the Clipboard. An internal error occurred.” This is likely because Adobe Reader tried to modify the clipboard contents while the script was modifying them (removing the hard returns). You do not need to copy the text again; everything should work fine. [Note: The latest version (5/2/11) will close this popup for you automatically.]

Update (5/2/11)

I finally made a breakthrough and identified why the script would sometimes stop working and have to be restarted. Turns out it’s an issue with Windows, not anything to do with my code. Anyway, I’ve included a workaround as well as some new features, such as an option to automatically add quotes to your copied selections (right-click on the tray icon to enable) and the ability to left-click on the tray icon to disable or enable the program. The program’s been significantly changed, but most of it is under the hood. The script should now have absolutely no bugs. Of course, if you think you’ve found one, please contact me.

Update (11/21/2010)

I’ve been using this program a lot while writing some big research papers, so I returned to the script to make toggling it more elegant. Now, just press Ctrl+Shift+D to toggle whether the script is active or not. The script’s icon will reflect this change. You no longer need the helper program for this functionality.

Update (9/2/2010)

Helper Program

I wrote this little script for a friend who sometimes wanted to retain the hard returns and sometimes didn’t. It’s not exactly elegant, but it works. Press Ctrl+Shift+D to toggle PDF Copy-Paster: if it’s already running, the helper closes it; if it isn’t running, the helper starts it.

A few notes

1. This only works with the compiled version of PDF Copy-Paster (the .exe version). Although I am also providing the helper’s source code (.ahk), that version too only works with the compiled PDF Copy-Paster.

2. This companion script needs to be placed in the same directory as PDF Copy-Paster.exe.

3. I was too lazy to make an icon for this, so it uses the default AHK icons.
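
The helper’s toggle behavior (close the program if it is running, start it otherwise) can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the helper’s AutoHotkey source: the function names are made up, the Ctrl+Shift+D hotkey binding is omitted, the Windows tasklist and taskkill commands stand in for whatever the real script does, and the executable is assumed to sit in the current working directory.

```python
import subprocess
from pathlib import Path

EXE_NAME = "PDF Copy-Paster.exe"  # file name from the download section


def is_running(exe_name, run=subprocess.run):
    """Return True if a process with this image name appears in tasklist."""
    # CSV output without a header; if nothing matches, tasklist prints an
    # informational message that does not contain the image name.
    result = run(
        ["tasklist", "/FI", f"IMAGENAME eq {exe_name}", "/FO", "CSV", "/NH"],
        capture_output=True,
        text=True,
    )
    return exe_name.lower() in result.stdout.lower()


def toggle(exe_name=EXE_NAME, exe_dir=None,
           run=subprocess.run, popen=subprocess.Popen):
    """Close the program if it is running; otherwise start it."""
    if is_running(exe_name, run):
        # Assumption: forcefully ending the process is an acceptable "close".
        run(["taskkill", "/IM", exe_name, "/F"],
            capture_output=True, text=True)
        return "stopped"
    # Assumption: the executable lives in exe_dir (default: current directory).
    folder = Path(exe_dir) if exe_dir is not None else Path.cwd()
    popen([str(folder / exe_name)])
    return "started"


if __name__ == "__main__":
    print(toggle())
```

The run and popen parameters exist only so the logic can be exercised without actually starting or killing a process.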

Helper Program Download

PDF Copy-Paster Toggler.exe (program)

PDF Copy-Paster Toggler.ahk (source code)

Article originally appeared on One Hour Programming - Programming for the Rest of Us (http://www.onehourprogramming.com/).
See website for complete article licensing information.