

It’s a curse because it’s used for things other than what it was intended for. It does a good job of representing printed material, but unfortunately people very commonly expect it to be something more akin to a word processor file.
I know the pain. While there are definitely solutions that work sometimes, there’s just no “one size fits all” that I’m aware of. PDFs can represent text very differently internally.
For one project where extracting the text produced a complete mess, what I did was convert the PDF pages to images and then OCR them…
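Roughly, the images-then-OCR route can look like the sketch below. This is not the code from that project, just a minimal illustration that assumes the third-party `pdf2image` and `pytesseract` packages (which in turn need Poppler and the Tesseract binary installed). The function names `ocr_pages` and `pdf_to_text` and the 300 DPI default are my own assumptions.

```python
from typing import Callable, Iterable, List


def ocr_pages(images: Iterable, recognize: Callable[[object], str]) -> List[str]:
    """Run the given OCR callable on each page image; return stripped text per page."""
    return [recognize(image).strip() for image in images]


def pdf_to_text(pdf_path: str, dpi: int = 300) -> str:
    """Render every PDF page to an image, OCR it, and join the page texts."""
    # Imported here so the module still loads when the OCR stack is missing.
    from pdf2image import convert_from_path  # requires Poppler
    import pytesseract  # requires the Tesseract binary

    # One PIL image per page; a higher DPI usually helps recognition but is slower.
    images = convert_from_path(pdf_path, dpi=dpi)
    pages = ocr_pages(images, pytesseract.image_to_string)
    return "\n\n".join(pages)
```

Calling `pdf_to_text("scan.pdf")` (hypothetical file name) would return the recognized text with a blank line between pages. The layout is lost, but for PDFs whose internal text is a mess that is often an acceptable trade.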
Hate? Digital decluttering feels really good, for me anyway.
To my knowledge it’s not supposed to differ.
If you trust that the client (which is open source) is doing what it’s supposed to do, security-wise I don’t think there’s a difference between self-hosting and using Bitwarden’s service.
No, you don’t need to trust the VPS provider. The Vaultwarden password storage is encrypted, and the master password is never transmitted to the server. The passwords are decrypted only locally on your device.
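To make the “master password never leaves your device” part concrete, here is a simplified sketch of the idea, loosely modelled on the key derivation Bitwarden documents publicly. It is not the actual client code: the function names, the iteration count, and the choice of PBKDF2 here are assumptions for illustration, and the vault encryption step itself is left out.

```python
import hashlib


def derive_master_key(master_password: str, email: str, iterations: int = 600_000) -> bytes:
    """Stretch the master password into a 32-byte key, entirely on the client.

    The email acts as the salt. The iteration count is an assumed default.
    """
    salt = email.strip().lower().encode("utf-8")
    return hashlib.pbkdf2_hmac("sha256", master_password.encode("utf-8"), salt, iterations)


def derive_login_hash(master_key: bytes, master_password: str) -> bytes:
    """Derive the value sent to the server for authentication.

    It is a further one-way hash of the master key, so the server receives
    neither the master password nor the key that unlocks the vault.
    """
    return hashlib.pbkdf2_hmac("sha256", master_key, master_password.encode("utf-8"), 1)


# The master key stays on the device and unlocks the vault locally.
# Only the login hash (and the already-encrypted vault) goes over the wire.
master_key = derive_master_key("correct horse battery staple", "me@example.com")
login_hash = derive_login_hash(master_key, "correct horse battery staple")
```

So whoever runs the server, whether that’s a VPS provider or Bitwarden, only ever holds the encrypted vault and a hash that can’t be turned back into the key.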
For files I just use the WebDAV support that’s built into Apache. It’s really not fancy, but it does all I need.
I use RoundCube. I think it’s one of the oldest solutions out there, and it’s pretty good (and not ugly as of a few years ago).
Interesting, I’ll keep it in mind next time I have to deal with this problem (hopefully never but who knows).
A few years ago I was in contact with researchers who were developing an AI tool to parse PDFs (I think they didn’t care about converting to editable formats, only about extracting data). From their material I got the impression that it’s extremely difficult to do right using traditional algorithms.