
Install paperless-ng: Install paperless

July 14th, 2021 - ittutorial (1 min)

I’ve finished setting up the shares, so it’s time to install paperless. The repository offers installation via Docker Compose and via Ansible. I need a combination of the two, with the NAS on top, so I’ve decided to follow this approach to convert the Docker Compose file into a list of Ansible tasks.

Paperless has a number of dependencies:

  1. Redis
  2. Gotenberg - to convert office documents to PDF
  3. Apache Tika - content analysis (text extraction from office documents)
  4. PostgreSQL - Metadata storage

Dependencies

Directories

I’ve learned the hard way that ansible with Docker needs volume directories already prepared. For this, I’ve created a specific task:

- name: Prepare directories
  hosts: app
  tasks:
    - name: Directories
      file:
        state: directory
        path: '{{ item }}'
        mode: '0755'
        owner: document
        group: documents
      with_items:
        - /mnt/documents-consume/consume
        - /mnt/documents-consume/export
        - /opt/documents/paperless/media
        - /opt/documents/paperless/data

All these will be used as docker volumes for paperless.

Group install tasks

I’ve grouped all subsequent tasks under a single entry:

- name: Set up network
  hosts: app
  tasks:
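
As a rough sketch of how the pieces fit together, the task fragments in the following sections are indented under `tasks:` and run top to bottom, so the network and the backing services are created before paperless itself. Only the first two are shown here, copied from the sections below:

```yaml
- name: Set up network
  hosts: app
  tasks:
    # Each fragment from the later sections goes here, indented one level.
    - name: Create network
      docker_network:
        name: network_paperless
        ipam_config:
          - subnet: '172.16.99.0/24'

    - name: Setup REDIS broker
      docker_container:
        name: broker
        recreate: true
        restart_policy: unless-stopped
        image: 'redis:6.0'
        networks:
          - name: 'network_paperless'
```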

Network

I can’t use depends_on in Ansible, so I have to emulate the dependencies: the tasks run in order, and a custom network lets the containers reach each other by name:

- name: Create network
  docker_network:
    name: network_paperless
    ipam_config:
      - subnet: '172.16.99.0/24'

The subnet on the last line can be any internal network range you like.

Redis

Installing Redis in Docker is simple, and it is just as simple with Ansible:

- name: Setup REDIS broker
  docker_container:
    name: broker
    recreate: true
    restart_policy: unless-stopped
    image: 'redis:6.0'
    networks:
      - name: 'network_paperless'

PostgreSQL

Next task in line is PostgreSQL:

- name: Install database
  docker_container:
    name: 'db'
    recreate: true
    restart_policy: unless-stopped
    image: 'postgres:13'
    user: '<doc user id>:<doc user group>'
    volumes:
      - '/mnt/documents/db:/var/lib/postgresql/data'
    env:
      POSTGRES_DB: 'paperless'
      POSTGRES_USER: 'paperless'
      POSTGRES_PASSWORD: '{{ paperless_pg_password }}'
    networks:
      - name: 'network_paperless'

Here, I’ve had to specify the UID:GID of the document user created previously. Otherwise, the NAS mount (where the DB is located) would barf.
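
If you don’t have the numeric IDs at hand, one way to look them up from the playbook is the getent module. This is only a sketch: it assumes the user is called document, as in the directories task above, and it reports the user’s primary group, which may differ from the group you actually want to use.

```yaml
- name: Look up the document user
  getent:
    database: passwd
    key: document

- name: Show the UID:GID to put in the container definitions
  debug:
    # In the passwd entry, field 1 is the UID and field 2 is the primary GID
    msg: "{{ ansible_facts.getent_passwd['document'][1] }}:{{ ansible_facts.getent_passwd['document'][2] }}"
```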

Document processing

The document processing is done via Gotenberg and Apache Tika:

- name: Install Gotenberg
  docker_container:
    name: 'gotenberg'
    recreate: true
    restart_policy: unless-stopped
    image: 'thecodingmachine/gotenberg'
    networks:
      - name: 'network_paperless'
    env:
      DISABLE_GOOGLE_CHROME: '1'

- name: Install Apache Tika
  docker_container:
    name: 'tika'
    recreate: true
    restart_policy: unless-stopped
    image: 'apache/tika'
    networks:
      - name: 'network_paperless'

PaperlessNG

The final piece is the paperless program.

- name: Install paperless-ng
  docker_container:
    name: "paperless"
    recreate: true
    restart_policy: unless-stopped
    image: "jonaswinkler/paperless-ng:latest"
    #user: "1005:8675310"
    networks:
      - name: "network_paperless"
    ports:
      - "38000:8000"
    env:
      PAPERLESS_REDIS: "redis://broker:6379"
      PAPERLESS_DBHOST: "db"
      PAPERLESS_DBUSER: "paperless"
      PAPERLESS_DBPASS: "{{ paperless_pg_password }}"
      PAPERLESS_TIKA_ENABLED: "1"
      PAPERLESS_TIKA_GOTENBERG_ENDPOINT: "http://gotenberg:3000"
      PAPERLESS_TIKA_ENDPOINT: "http://tika:9998"
      USERMAP_UID: "<doc user id>"
      USERMAP_GID: "<doc user group>"
      PAPERLESS_ADMIN_USER: "admin"
      PAPERLESS_ADMIN_PASSWORD: "changeme"
    healthcheck:
      # Runs inside the container, so it targets the internal port 8000, not the published 38000
      test: ["CMD", "curl", "-f", "http://localhost:8000"]
      interval: 30s
      timeout: 10s
      retries: 5
    volumes:
      - "/mnt/documents/paperless/data:/usr/src/paperless/data"
      - "/mnt/documents/paperless/media:/usr/src/paperless/media"
      - "/mnt/documents-consume/consume:/usr/src/paperless/consume"
      - "/mnt/documents-consume/export:/usr/src/paperless/export"

Here, I had to do the same UID:GID specification, so the NAS mount would work nicely.
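
To check that the container actually came up, a final task can poll the published port. This is a sketch rather than part of my playbook: the retry counts are arbitrary, and it assumes the web UI answers with HTTP 200 on port 38000 of the Docker host once paperless is ready.

```yaml
- name: Wait for the paperless web UI
  uri:
    url: "http://localhost:38000"
    status_code: 200
  register: paperless_ui
  # Keep polling until the UI answers; give up after 10 attempts
  until: paperless_ui.status == 200
  retries: 10
  delay: 15
```

The task runs on the app host, so localhost here means the machine running Docker, not the container.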

Notes

Please note that I keep paperless_pg_password as a secret variable, instead of using the default password, paperless.
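
One possible way to store such a secret is Ansible Vault. The sketch below is an assumption, not my exact setup: the file path is hypothetical (it only works as-is if app is an inventory group), and the value is a placeholder.

```yaml
# group_vars/app/vault.yml (hypothetical path)
# Encrypt it with: ansible-vault encrypt group_vars/app/vault.yml
paperless_pg_password: 'replace-with-a-long-random-password'
```

The playbook is then run with --ask-vault-pass (or a vault password file) so that Ansible can decrypt the variable.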

HTH,

