Kland mistake

Posted by haloopdy
Hello all, it's Random (site owner etc). I'm hoping those affected will see this, since Kland was always an SBS thing. If you don't know what Kland is, don't worry, you're not affected.

The problem

I made a huge mistake while migrating Kland recently and accidentally exposed the master list of all images for 12 hours. In that time, some bots found the master list and began slowly crawling the images it linked to. I caught it before too many images were pulled (fewer than 200, and I think closer to 100), and I don't think any of them were particularly private.

The solution

I took Kland offline, rehashed all the images with a larger hash, and put it back up. I've been monitoring the traffic, and the bots are still requesting the old hashes, none of which exist anymore (except for those in the public bucket, which was always public). So the "exposed images" problem is now fixed.
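For anyone curious what "rehashing with a larger hash" could look like, here is a minimal sketch. It is not Kland's actual code: the function names `rehash_images` and `lookup_new_link`, the 16-byte size, and the use of random tokens are all assumptions made for illustration. The idea is just to give every image a new, longer, unguessable identifier while keeping a private old-to-new mapping.

```python
import secrets


def rehash_images(old_ids, nbytes=16):
    """Assign each old image ID a fresh, longer random ID.

    Hypothetical helper: returns a dict mapping old ID -> new ID.
    """
    mapping = {}
    used = set()
    for old in old_ids:
        new = secrets.token_urlsafe(nbytes)
        # Collisions are astronomically unlikely, but check anyway.
        while new in used:
            new = secrets.token_urlsafe(nbytes)
        used.add(new)
        mapping[old] = new
    return mapping


def lookup_new_link(mapping, old_id):
    """Return the new ID for an old one, or None if it is unknown."""
    return mapping.get(old_id)
```

Because the new identifiers are random rather than derived from the old ones, anyone holding only the leaked list can't work out the new links; the mapping has to stay on the server.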

How it affects you

Old links to images from private buckets will no longer work. If you run across one of these broken links, you can still get to the original image by visiting ANY bucket other than the default (you can use "test" if you like) and using the very obvious, angrily orange form at the top of the page to look up the new link. It works in any bucket, not just the bucket the image came from. Furthermore, if you know the names of your old buckets or have links to them, you can still browse your old images normally.

The severity

Statistically, fewer than 0.2% of the actually private images were crawled. Furthermore, based on the access pattern, chances are extremely high that these were well-behaved bots from reputable sources that will delete the images once they load the updated robots.txt.

I think the crawl came through our shortlink service, shsbs.xyz, which had a misconfigured robots.txt that allowed crawling. This wasn't a problem in the past, since shsbs.xyz has no master list and no index; it ONLY serves image links. However, in migrating Kland, I accidentally let the webservice return the list of files when you access the index of shsbs.xyz, meaning well-behaved web crawlers visiting shsbs.xyz were greeted with links to every image.

Looking at the access patterns, which are VERY slow, very methodical, and exceedingly polite, it was almost certainly just normal web crawlers indexing shsbs.xyz. After all, in 12 hours they scanned fewer than 200 images. With the updated robots.txt, AND with nearly every link they're requesting now giving a 404, shsbs.xyz will fall off search results, if it ever even made it onto them.
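To show why the robots.txt matters for polite crawlers, here is a small sketch using Python's standard `urllib.robotparser`. The two rule sets below are illustrative guesses, not the real contents of the shsbs.xyz robots.txt before or after the fix, and the URL path and `ExampleBot` user agent are made up.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rule sets only; the real files may differ.
PERMISSIVE = ["User-agent: *", "Disallow:"]     # empty Disallow = allow everything
RESTRICTIVE = ["User-agent: *", "Disallow: /"]  # block the whole site


def allows_crawling(robots_lines, url, user_agent="ExampleBot"):
    """Return True if a rule-following crawler may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(user_agent, url)


url = "https://shsbs.xyz/example"  # hypothetical image link
print(allows_crawling(PERMISSIVE, url))
print(allows_crawling(RESTRICTIVE, url))
```

A crawler that respects the rules would go ahead under the first rule set and skip the site entirely under the second. This only affects bots that choose to obey robots.txt, which is why the new links were needed as well.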

Final thoughts

I sincerely hope that this change doesn't affect anyone negatively. I figure nobody is really looking at Kland images anywhere, even where they're linked, since it's all so old. And most links to Kland point to the public bucket anyway, which is unaffected.

If you're concerned about future images getting discovered and you know the bucket name, I can delete the images off the server if you want. However, please note that (a) the 100-200 images that were already crawled are crawled, and there's nothing we can do about it, and (b) the rest of the images have entirely new links, and the crawlers have basically given up at this point. They're tentatively checking 2 or 3 images every hour or so, getting nothing but 404s, and I bet they'll stop eventually.

If you forgot where Kland is, I don't want to post the direct link here since this website gets crawled frequently, but "kland" is the subdomain and the path is "image". I'm sure you can figure it out.