Community Inference - For Good or Bad
There is a surplus of great models that can be hosted locally, but few members of the community have unlimited VRAM; most setups top out somewhere around 128 GB. What if we could pool our resources and host models for each other?