
Invalid 8 Byte Files, Frequent Hangs of Applications, ERROR - "path/file.something" is not a valid cache file, deleting #56

Open
GeorgiaM-honestly opened this issue Jun 30, 2021 · 1 comment


Hello,

I love catfs. I'm using it to provide caching for a goofys s3 "mount". It makes a huge difference!

I do, however, have some nagging issues that happen on a regular basis. I hope I have provided enough information, and I would like to help resolve this if it is not specific to my local setup.

System:

  • Gentoo linux
  • uname -a: Linux deleted 5.12.13-gentoo-x86_64 #1 SMP Thu Jun 24 12:24:13 DELETED 2021 x86_64 Intel(R) Core(TM) i7-7820X CPU @ 3.60GHz GenuineIntel GNU/Linux
  • 16 cores
  • 32GB RAM
  • Filesystem which the cache is using: 100GB, EXT4-fs (sde1): mounted filesystem with ordered data mode. Opts: errors=remount-ro,user_xattr. Quota mode: disabled.

catfs issues:

  • When viewing a directory full of .tif files, the first file requested is always 8 bytes long, which appears to be the bare minimum of data needed for 'file' to report that it is a TIFF image.

  • The first 8 bytes of the real source file are indeed identical, but the source file is far larger than 8 bytes.

  • When viewing a directory full of .tif files, whether or not it is the first request, catfs prints a screenful of errors. Examples:

2021-06-30 16:53:22 ERROR - "2005_Hurricane_Katrina/aug31JpegTiles_GCS_NAD83/aug31C0902830w295700n.tif" is not a valid cache file, deleting
2021-06-30 16:53:23 ERROR - "2005_Hurricane_Katrina/aug31JpegTiles_GCS_NAD83/aug31C0902830w295830n.tif" is not a valid cache file, deleting
2021-06-30 16:53:23 ERROR - "2005_Hurricane_Katrina/aug31JpegTiles_GCS_NAD83/aug31C0902830w300000n.tif" is not a valid cache file, deleting
2021-06-30 16:53:24 ERROR - "2005_Hurricane_Katrina/aug31JpegTiles_GCS_NAD83/aug31C0902830w300130n.tif" is not a valid cache file, deleting
2021-06-30 16:53:24 ERROR - "2005_Hurricane_Katrina/aug31JpegTiles_GCS_NAD83/aug31C0903000w295530n.tif" is not a valid cache file, deleting
2021-06-30 16:53:25 ERROR - "2005_Hurricane_Katrina/aug31JpegTiles_GCS_NAD83/aug31C0903000w295700n.tif" is not a valid cache file, deleting
2021-06-30 16:53:25 ERROR - "2005_Hurricane_Katrina/aug31JpegTiles_GCS_NAD83/aug31C0903000w295830n.tif" is not a valid cache file, deleting
2021-06-30 16:53:26 ERROR - "2005_Hurricane_Katrina/aug31JpegTiles_GCS_NAD83/aug31C0903000w300000n.tif" is not a valid cache file, deleting

  • During this time, which can last several minutes, the application trying to use the data hangs and usually has to be force-killed.

  • While catfs is "deleting" files, the total size of the cache directory root grows.

  • After these few minutes, everything works, except for that first invalid 8 byte file.

  • Often, but not always, files that had already been fetched are downloaded again after this delay (increased inbound network traffic is observed).

  • Gwenview STDERR on the 8 byte file:
    gwenview.libtiff: Error JPEGLib "Not a JPEG file: starts with 0xd5 0x7e"
    org.kde.kdegraphics.gwenview.lib: Could not generate thumbnail for file "file:///media/s3-noaa-eri-pds-LOCAL-catfs/2005_Hurricane_Katrina/sep02JpegTiles_GCS_NAD83/sep02C0890130w290900n.tif"
    gwenview.libtiff: Error JPEGLib "Not a JPEG file: starts with 0xd5 0x7e"
    org.kde.kdegraphics.gwenview.lib: Could not generate thumbnail for file "file:///media/s3-noaa-eri-pds-LOCAL-catfs/2005_Hurricane_Katrina/sep02JpegTiles_GCS_NAD83/sep02C0890130w290900n.tif"

  • Sometimes after all the other images have been downloaded and thumbnails created, I can double-click on that first file that had been 8 bytes, and it's now downloaded, and will display without error. Other times, it stays 8 bytes.
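The truncation described above can be checked by comparing the same file through the goofys mount and through the catfs mount. This is a minimal sketch, assuming GNU or BSD `head`, `wc`, and `od` are available; the helper name `compare_head` is hypothetical and not part of catfs or goofys.

```shell
#!/bin/sh
# compare_head SRC CACHED (hypothetical helper): print the size and the first
# 8 bytes (as hex) of both files, then a one-word verdict on the last line.
compare_head() {
    src=$1
    cached=$2
    src_size=$(wc -c < "$src" | tr -d ' ')
    cached_size=$(wc -c < "$cached" | tr -d ' ')
    src_head=$(head -c 8 "$src" | od -An -tx1 | tr -d ' \n')
    cached_head=$(head -c 8 "$cached" | od -An -tx1 | tr -d ' \n')
    echo "source: $src_size bytes, head $src_head"
    echo "cached: $cached_size bytes, head $cached_head"
    if [ "$src_size" = "$cached_size" ]; then
        echo "SAME_SIZE"
    elif [ "$src_head" = "$cached_head" ]; then
        # Same leading bytes but a different length: the symptom reported here.
        echo "TRUNCATED"
    else
        echo "DIFFERENT"
    fi
}

# Example, using the mount points and the file named in this report:
#   f=2005_Hurricane_Katrina/sep02JpegTiles_GCS_NAD83/sep02C0890130w290900n.tif
#   compare_head "/media/s3-noaa-eri-pds-LOCAL/$f" "/media/s3-noaa-eri-pds-LOCAL-catfs/$f"
```

A `TRUNCATED` verdict means the two files share their leading 8 bytes but differ in length, which matches the 8 byte file behaviour.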

Reproduction:

  • Mount a public AWS S3 bucket with goofys: goofys -o allow_other -o ro noaa-eri-pds /media/s3-noaa-eri-pds-LOCAL

  • Start catfs appropriately: catfs -o ro --free 1G /media/s3-noaa-eri-pds-LOCAL /home/myusername/.goofys-cache /media/s3-noaa-eri-pds-LOCAL-catfs

  • Explore the images with a tool such as gwenview that will show thumbnails

  • Optional: Start gwenview on the command line, outputting STDOUT and STDERR to their own files and then watch these files: gwenview 1> /tmp/gwenview-STDOUT.txt 2> /tmp/gwenview-STDERR.txt & tail -f /tmp/gwenview-STD*
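After browsing, any files that ended up exactly 8 bytes long can be listed with `find`. This is a minimal sketch; the function name `list_exact_size` is hypothetical, and it assumes the truncated files are visible as regular files under the cache directory passed to catfs.

```shell
#!/bin/sh
# list_exact_size DIR [BYTES] (hypothetical helper): print every regular file
# under DIR whose size is exactly BYTES bytes (default 8).
list_exact_size() {
    dir=$1
    bytes=${2:-8}
    # The "c" suffix makes find compare the size in bytes rather than blocks.
    find "$dir" -type f -size "${bytes}c"
}

# Example, using the cache directory from the catfs command above:
#   list_exact_size /home/myusername/.goofys-cache
```

Running it before and after the delay would show whether the 8 byte file is eventually replaced by the full download.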


GeorgiaM-honestly commented Jul 1, 2021

I noticed I was getting complaints from catfs about too many open files. The default was 1024 for the user, so I increased it to 10240 and rebooted.

I'm no longer getting the "too many open files" errors, but the other issues continue. I checked how many open files each process has, and here is the result. I am not entirely sure this is the right way to count "open files", but I think the numbers are interesting. Is this normal?

As root: lsof -n | grep myusername | awk '{print $1}' | sort -n | uniq -c | sort -nr | head -n 4

133961 catfs
78938 chrome
49162 thunderbi
16426 gwenview
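These lsof totals probably overstate the number of open descriptors: lsof also prints rows for the working directory, the program text, and memory-mapped files, some lsof builds repeat rows per thread, and `grep myusername` matches the name in any column. A count that is closer to what the open-files limit applies to can be read from /proc. This is a minimal sketch for Linux; the function names `count_fds` and `fd_limit` are hypothetical.

```shell
#!/bin/sh
# count_fds PID (hypothetical helper): number of file descriptors the process
# currently holds, taken from the entries under /proc/PID/fd (Linux only).
count_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l | tr -d ' '
}

# fd_limit PID (hypothetical helper): soft "Max open files" limit of the process.
fd_limit() {
    awk '/^Max open files/ { print $4 }' "/proc/$1/limits"
}

# Example (run as root or as the owner of the process):
#   for pid in $(pgrep -x catfs); do
#       echo "catfs pid $pid: $(count_fds "$pid") of $(fd_limit "$pid") descriptors"
#   done
```

If the number reported for catfs keeps climbing toward its limit while thumbnails are generated, that would point to descriptors not being released.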
