Learn how to create and process PDF files from the terminal with a few command-line tools.

GhostScript

With GhostScript you can perform several processing tasks on a PDF file, such as compression.

The following command may look a bit complex, but it is useful for understanding what each parameter does.

gs \
-dNOPAUSE \
-dQUIET \
-dBATCH \
-sDEVICE=pdfwrite \
-dPDFSETTINGS=/printer \
-dAutoFilterColorImages=false \
-dAutoFilterGrayImages=false \
-dDownsampleColorImages=true \
-dDownsampleGrayImages=true \
-dDownsampleMonoImages=true \
-dColorImageResolution=150 \
-dGrayImageResolution=150 \
-dMonoImageResolution=150 \
-dPrinted=false \
-sOutputFile=output.pdf \
input.pdf
  • -dNOPAUSE -dQUIET -dBATCH: by default, gs processes the PDF page by page and waits for a manual confirmation between pages. -dNOPAUSE removes the manual confirmation and -dBATCH makes gs exit automatically when processing finishes. -dQUIET hides the routine messages of the process (equivalent to -q).
  • -sDEVICE=pdfwrite: this selects the output device, which determines the output file format. There are several options: “pdfwrite”, “ps2write”, “png16m”, “jpeg”, etc.
  • -dPDFSETTINGS=/printer: these are predefined templates for processing a PDF. Allowed values are (from lowest to highest quality): “/screen”, “/ebook”, “/printer” and “/prepress” (More info). There is also a “/default” template. You can override individual settings, and this is what I do with the following parameters.
  • -dAutoFilterColorImages=false -dAutoFilterGrayImages=false: when “true”, the compression filter is chosen automatically for each image; when “false”, a fixed filter is used instead. I’m not sure about the details, but setting these to “false” makes the output file smaller.
  • -dDownsampleColorImages=true -dDownsampleGrayImages=true -dDownsampleMonoImages=true: these enable downsampling, so images with a higher resolution can be reduced to the target resolution.
  • -dColorImageResolution=150 -dGrayImageResolution=150 -dMonoImageResolution=150: these set the target image resolution in DPI (dots per inch).
  • -dPrinted=false: when this parameter is “true”, gs assumes the output will be printed, so it is not necessary to keep hyperlinks. Setting it to “false” keeps them.
  • -sOutputFile=output.pdf: this parameter sets the filename of the output.
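
The parameters above can be wrapped in a small shell function so that the resolution becomes an argument. This is only a minimal sketch: the function name compress_pdf and the 150 DPI default are assumptions for illustration, not part of GhostScript.

```shell
# compress_pdf INPUT OUTPUT [DPI]
# Hypothetical helper: runs the command explained above, using a single
# DPI value for color, gray and mono images (default: 150).
compress_pdf() {
    gs -dNOPAUSE -dQUIET -dBATCH \
        -sDEVICE=pdfwrite \
        -dPDFSETTINGS=/printer \
        -dAutoFilterColorImages=false \
        -dAutoFilterGrayImages=false \
        -dDownsampleColorImages=true \
        -dDownsampleGrayImages=true \
        -dDownsampleMonoImages=true \
        "-dColorImageResolution=${3:-150}" \
        "-dGrayImageResolution=${3:-150}" \
        "-dMonoImageResolution=${3:-150}" \
        -dPrinted=false \
        "-sOutputFile=$2" \
        "$1"
}
```

For example, compress_pdf input.pdf output.pdf 96 would target 96 DPI. Because every variable is quoted, filenames with spaces are passed to gs as single arguments.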

Some examples

# Create a preview (a PDF with the first pages)
gs -dNOPAUSE -dBATCH -dQUIET -sDEVICE=pdfwrite -dFirstPage=1 -dLastPage=5 -sOutputFile=preview.pdf input.pdf
# Export a single-page PDF as a JPEG image (quality: 80%)
gs -dNOPAUSE -dBATCH -q -sDEVICE=jpeg -dJPEGQ=80 -sOutputFile=image.jpg input.pdf
# Compress a PDF using the 'ebook' template
gs -dNOPAUSE -dBATCH -q -sDEVICE=pdfwrite -dPDFSETTINGS=/ebook -sOutputFile=output.pdf input.pdf
# Convert a color PDF to grayscale (B&W)
gs -dNOPAUSE -dBATCH -q -sDEVICE=pdfwrite -sColorConversionStrategy=Gray -dProcessColorModel=/DeviceGray -sOutputFile=input_bw.pdf input.pdf
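
The JPEG example above targets a single-page PDF. For a multi-page file, gs accepts a %d-style pattern in -sOutputFile to write one image per page, and -r sets the rendering resolution. A minimal sketch follows; the function name pdf_to_jpegs and the 150 DPI default are assumptions chosen for illustration.

```shell
# pdf_to_jpegs INPUT PREFIX [DPI]
# Hypothetical helper: writes PREFIX-001.jpg, PREFIX-002.jpg, ...
# %03d is replaced by gs with the zero-padded page number.
pdf_to_jpegs() {
    gs -dNOPAUSE -dBATCH -q \
        -sDEVICE=jpeg -dJPEGQ=80 \
        "-r${3:-150}" \
        "-sOutputFile=$2-%03d.jpg" \
        "$1"
}
```

Calling pdf_to_jpegs input.pdf page 200 would render every page at 200 DPI into page-001.jpg, page-002.jpg and so on.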

Note (for scripts): if you store a GhostScript command in a variable and get errors with filenames that contain spaces, escape the spaces in each filename and run the command with sh -c (with double quotes around the variable). This is needed because escape characters stored inside a variable are not interpreted when the variable is expanded (check Bash syntax for more info). For example, a file named ‘pdf-compress.sh’:

#!/bin/bash
COMMAND="gs -dNOPAUSE -dBATCH -q -sDEVICE=pdfwrite "
# INPUT and OUTPUT are filenames; spaces are escaped so sh -c keeps each name as one word
INPUT=${1// /\\ }
OUTPUT=${2// /\\ }
COMMAND=${COMMAND}"-sOutputFile=$OUTPUT $INPUT"
sh -c "$COMMAND"

You can run the script as usual (escaping spaces with \):

pdf-compress.sh input\ with\ spaces.pdf output\ with\ spaces.pdf
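
As an alternative sketch, the command can be built as a list of separate arguments instead of a single string, which avoids both the escaping step and sh -c. The version below keeps the arguments in the positional parameters so it also works in plain sh (a Bash array would serve the same purpose). The function name compress and the file name ‘pdf-compress-quoted.sh’ are hypothetical.

```shell
#!/bin/bash
# pdf-compress-quoted.sh (hypothetical name)
# compress INPUT OUTPUT: builds the gs command as a list of arguments.
compress() {
    input=$1
    output=$2
    # Start with the fixed part of the command...
    set -- gs -dNOPAUSE -dBATCH -q -sDEVICE=pdfwrite
    # ...then append the filenames; each one stays a single argument
    set -- "$@" "-sOutputFile=$output" "$input"
    # Run the command exactly as stored, without re-parsing it
    "$@"
}

# Only run when the script receives both filenames
if [ "$#" -eq 2 ]; then
    compress "$1" "$2"
fi
```

With this variant the filenames only need ordinary quoting on the command line, for example: pdf-compress-quoted.sh "input with spaces.pdf" "output with spaces.pdf"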

LibreOffice

Using LibreOffice from the command line is not very common, but it’s useful when you want a script to convert a text document to PDF.

# This will use default settings for PDF export
soffice --convert-to pdf --outdir /output-folder input.docx
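
For scripted use, the same command can convert several documents in one call. The sketch below assumes a hypothetical helper named docs_to_pdf; it adds --headless so that no window is opened and creates the output folder if it is missing.

```shell
# docs_to_pdf OUTDIR FILE...
# Hypothetical helper: converts every FILE to PDF inside OUTDIR,
# using the default PDF export settings.
docs_to_pdf() {
    outdir=$1
    shift
    mkdir -p "$outdir"
    soffice --headless --convert-to pdf --outdir "$outdir" "$@"
}
```

For example, docs_to_pdf ./pdf report.docx notes.odt would leave report.pdf and notes.pdf in the ./pdf folder.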

GraphicsMagick

This program allows you to convert one or several images to PDF. It’s as simple as this:

gm convert image1.png image2.png file.pdf

Poppler

See Poppler: command-line PDF tools.

pdftk

This tool for manipulating PDF files can do a lot of things. pdftk syntax is simple:

pdftk <input file> <operation> output <output file> [<other parameters>]

These are some of the available ‘operations’:

  • cat <page-range>: use it to merge, split or rotate pages.
    # Remove first page
    pdftk input.pdf cat 2-end output out.pdf
    
    # Select odd pages within a range
    pdftk input.pdf cat 3-27odd output out.pdf
    
    # Two ranges
    pdftk input.pdf cat 2-5 7-9 output out.pdf
    
    # Rotate clockwise (90 degrees)
    # Page rotation can be north: 0, east: 90, south: 180, west: 270
    pdftk input.pdf cat 1-endeast output out.pdf
    
  • background and stamp: add a watermark.
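
The operations in the list above can also be sketched as small shell functions. With several input files and no page range, cat merges them in order; stamp places the watermark over each page, while background would place it behind the page content. The function names merge_pdfs and stamp_pdf are assumptions for illustration.

```shell
# merge_pdfs OUTPUT INPUT...
# Hypothetical helper: concatenates all INPUT files into OUTPUT.
merge_pdfs() {
    merged=$1
    shift
    pdftk "$@" cat output "$merged"
}

# stamp_pdf INPUT WATERMARK OUTPUT
# Hypothetical helper: overlays the first page of WATERMARK on every
# page of INPUT (replace 'stamp' with 'background' to put it behind).
stamp_pdf() {
    pdftk "$1" stamp "$2" output "$3"
}
```

For example, merge_pdfs all.pdf part1.pdf part2.pdf joins both parts, and stamp_pdf all.pdf watermark.pdf all_marked.pdf adds the watermark.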

After adding the output filename, you can add some additional parameters to modify the file:

  • flatten: flatten a PDF form.
  • user_pw PROMPT: encrypt the file with a password.
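
A minimal sketch of how these parameters fit the general syntax, placed after the output filename. The function names flatten_pdf and encrypt_pdf are hypothetical; with PROMPT, pdftk asks for the password interactively instead of reading it from the command line.

```shell
# flatten_pdf INPUT OUTPUT
# Hypothetical helper: merges form fields into the page content.
flatten_pdf() {
    pdftk "$1" output "$2" flatten
}

# encrypt_pdf INPUT OUTPUT
# Hypothetical helper: encrypts OUTPUT; pdftk prompts for the user password.
encrypt_pdf() {
    pdftk "$1" output "$2" user_pw PROMPT
}
```

For example, flatten_pdf form.pdf form_flat.pdf writes a flattened copy of the form.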

See my other pdftk posts for Encrypting PDFs, How to flatten PDF forms to avoid compatibility errors and How to add a watermark to your multimedia files.

OCRmyPDF

A tool that adds an OCR layer to a PDF. Check my post: Add an OCR layer to a PDF with Tesseract and ocrmypdf.

If you have any suggestions, feel free to contact me via social media or email.