
Feature Request : Add File from URL #4216

Open · calumk opened this issue Jan 23, 2024 · 2 comments

calumk commented Jan 23, 2024

Appreciate this is probably low priority.

As part of my workflow, I often find myself downloading images/files only to re-upload them to PocketBase.

It would be nice if the Admin file fields had an "import from URL" option.

arcward added a commit to arcward/pocketbase that referenced this issue Feb 4, 2024
inspiration from pocketbase#4216


arcward commented Feb 4, 2024

@calumk I don't know how well this actually fits your use case, but check out arcward/pocketbase/main/examples/base/main.go. I was poking around this project, and your feature request seemed like a good way for me to get a little familiar with the codebase, so I threw it together yesterday. I built it on the example app in the repo as an extension, since I'm not yet familiar enough with the codebase to integrate it directly as a feature. Plagiarizing my own commit message:

On start, it creates two collections (a sketch of the schema setup follows this list):

- `jobs`, with fields:
  - `url`: URL to download
  - `compress`: gzips the downloaded file content, if not already gzipped
- `downloads`, with fields:
  - `mimetype`: Content-Type header, or guessed
  - `filename`: parsed from Content-Disposition, or the last part of the URL path plus an optional extension derived from the MIME type (if the path has no extension); a helper sketch follows at the end of this comment
  - `content`: downloaded file content (this is the normal file field)
  - `downloaded`: timestamp of when the file was downloaded
  - `retries`: number of download attempts
  - `hash`: MD5 hash of the content
  - `error`: error string from the most recent download attempt
  - `encoding`: file encoding
  - `size`: file size
  - `status`: half-baked 'pending', 'completed', etc. indicator
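
For orientation, here's a minimal sketch of how collections like these might be created on startup with the PocketBase Go API of that era (roughly v0.20); the collection and field names follow the commit message, but the code is illustrative, not the actual implementation from the branch:

```go
package main

import (
	"log"

	"github.com/pocketbase/pocketbase"
	"github.com/pocketbase/pocketbase/core"
	"github.com/pocketbase/pocketbase/models"
	"github.com/pocketbase/pocketbase/models/schema"
)

func main() {
	app := pocketbase.New()

	app.OnBeforeServe().Add(func(e *core.ServeEvent) error {
		// Create the `jobs` collection on first start; `downloads`
		// would follow the same pattern with its larger field set.
		if _, err := app.Dao().FindCollectionByNameOrId("jobs"); err == nil {
			return nil // already exists
		}
		jobs := &models.Collection{
			Name: "jobs",
			Type: models.CollectionTypeBase,
			Schema: schema.NewSchema(
				&schema.SchemaField{Name: "url", Type: schema.FieldTypeUrl, Required: true},
				&schema.SchemaField{Name: "compress", Type: schema.FieldTypeBool},
			),
		}
		return app.Dao().SaveCollection(jobs)
	})

	if err := app.Start(); err != nil {
		log.Fatal(err)
	}
}
```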

Adding an entry to `jobs` will trigger a new entry to be added to `downloads`, and an initial download attempt is made from `jobs.url`. If it fails, it will be retried later, up to the `--download-max-retries` CLI flag (default 5). `--download-schedule` accepts a cron schedule (default: every 5 minutes), at which point it attempts to download (or retry) `downloads` entries that don't have content (and which are older than 5 minutes).
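
This trigger-and-retry flow could be wired up with PocketBase's record hooks and its `tools/cron` scheduler; a hypothetical continuation of the sketch above (inside `main`, before `app.Start()`; `downloadFromURL` and `retryPendingDownloads` are made-up helper names, and registering the flags on `app.RootCmd` is an assumption about how the branch does it):

```go
// CLI flags as described above (names taken from the commit message).
var maxRetries int
var schedule string
app.RootCmd.PersistentFlags().IntVar(&maxRetries, "download-max-retries", 5, "max download attempts per entry")
app.RootCmd.PersistentFlags().StringVar(&schedule, "download-schedule", "*/5 * * * *", "cron schedule for download retries")

// A new `jobs` record triggers a `downloads` record and a first attempt.
app.OnRecordAfterCreateRequest("jobs").Add(func(e *core.RecordCreateEvent) error {
	go downloadFromURL(app, e.Record) // hypothetical helper
	return nil
})

// On the cron schedule, retry `downloads` entries that still lack content.
// cron here is github.com/pocketbase/pocketbase/tools/cron.
app.OnBeforeServe().Add(func(e *core.ServeEvent) error {
	c := cron.New()
	c.MustAdd("downloadRetries", schedule, func() {
		retryPendingDownloads(app, maxRetries) // hypothetical helper
	})
	c.Start()
	return nil
})
```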

This is basically a first pass: it doesn't account for edge cases like files that take longer to download than the interval between scheduled 'catchup' attempts. The schedule is configurable, but the 'older than 5 minutes' retry filter is hardcoded, so changing the schedule might cause issues. Regardless, I tested it on a few dozen URLs (direct links to files, or URLs like example.com/docs/) with the default retry and schedule settings, and with/without compression, and it seems to work pretty well on the 'happy path'.
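
The filename and compression rules from the field list above map fairly directly onto the Go standard library; here's a sketch of what those two steps might look like (illustrative only; imports: bytes, compress/gzip, mime, net/http, path):

```go
// Filename: Content-Disposition first, then the last URL path segment,
// then an extension guessed from the MIME type if the name has none.
func guessFilename(resp *http.Response) string {
	if cd := resp.Header.Get("Content-Disposition"); cd != "" {
		if _, params, err := mime.ParseMediaType(cd); err == nil && params["filename"] != "" {
			return params["filename"]
		}
	}
	name := path.Base(resp.Request.URL.Path)
	if path.Ext(name) == "" {
		if exts, _ := mime.ExtensionsByType(resp.Header.Get("Content-Type")); len(exts) > 0 {
			name += exts[0]
		}
	}
	return name
}

// Compression: gzip streams start with the magic bytes 0x1f 0x8b,
// so content that already has them is stored as-is.
func maybeGzip(data []byte) ([]byte, error) {
	if bytes.HasPrefix(data, []byte{0x1f, 0x8b}) {
		return data, nil
	}
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(data); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}
```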


gedw99 commented May 14, 2024

This looks really useful.

Is it designed to have a GUI available from the admin screen, or the CLI?
