AI training data. A quagmire.
99% of the training and fine-tuning data used for foundation LLM AI models comes from the internet.
I have another system. In my garage, I am training an AI model built fundamentally on magazines, newspapers and publications I have rescued from dumpsters.
I have ~385,000 (maybe a lot more when I am done) and a majority of them have never been digitized. In fact I may have the last copies.
Most are on microfilm/microfiche. I train on EVERYTHING: written content, images, advertisements and more.
The early results from the models I am testing are absolutely astonishing and vastly unlike any current models.
The ethos this model has is so dramatic that you just may begin to believe it is AGI.
But why?
See, from the late 1800s to the mid-1960s, all of these archives have a narrative that is all but extinct today: a can-do ethos with a do-it-yourself mentality.
When I prompt these models, there is NOTHING they believe they cannot do. And frankly, with the millions of examples, from building a house to a gas mask, up to the various books and pamphlets that were sold in these magazines (I have about 45,000), there is no practical challenge these models cannot face.
No, you will not get “I am just a large language model and I can’t”; the model will synthesize an answer based on the millions of answers.
No, you will not get lectures on dangers with your questions. But it will know when you are asking “stupid questions” and have no problem telling you, like your great grandpa would have in his wood shop out back.
This is a slow process for me, as I have no investors and it is just me, microfilm and my garage. However, I am debating releasing early versions before I can complete the project. If I do, it will be like all of my open source releases: under an assumed name, not my own.
This is how I build AI models, and it is one answer to the question of why Human Resources at large AI companies freaks out over employees wanting me to lead their projects (you would find those conversations humorous).
Either way, I want to say there is something coming your way that will be the sum total of the mentality and ethos that got us to the Moon, in a single LLM AI. It will be yours, on your computer.
You and I and everyone will never be the same.
Jan 15, 2024 · 5:22 PM UTC