LightlyTrain is the first PyTorch framework to pretrain computer vision models on unlabeled data for industrial applications - lightly-ai/lightly-train
Really cool to see more tooling making self-supervised learning usable on real-world datasets. Domain shift is a recurring pain, especially when labels are limited—so being able to pretrain directly on unlabeled data is a big deal. Also great to see it open-sourced under AGPL. Have you tried LightlyTrain on any more niche domains, like satellite or industrial inspection data? Would be interesting to see how it performs outside the usual benchmarks. Nice work!
Thanks for the kind words, joelio182! Glad you see the value in making SSL more practical for real-world domain shift issues.
As liopeer mentioned, we have results for medical (DeepLesion) and agriculture (DeepWeeds) in the blog post. We haven't published specific benchmarks on satellite or industrial inspection data yet, but those are definitely the kinds of niche domains where pretraining on domain-specific unlabeled data should yield significant benefits. We're keen to explore more areas like these.
Our goal is exactly what you pointed out - bridging the gap between SSL research and practical application where labels are scarce. Appreciate the encouragement!
We tried it on medical and agricultural data, as mentioned in our launch post:
https://www.lightly.ai/blog/introducing-lightly-train
But we will keep exploring more domains and downstream tasks. You can be certain of that!
Computer Vision pretraining for the masses!
Finally a production-ready framework for pretraining!