These are some ideas I wrote up in 2022 for things I think would be interesting for people to work on. Some of them are kind of outdated now, but some might still hold up. If you're working on any of these things, I'd love to hear from you!
- AlphaZero for math, theorem proving with no human data
- in 2015, Go was too hard a problem for then-current deep RL methods on their own, so we were forced to combine them with MCTS to solve it
- today, theorem proving is in a similar place, so it's a good problem to sharpen our methods against
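
For a rough sense of what the search half of that recipe looks like, here is a minimal sketch of the PUCT selection rule used in AlphaZero-style MCTS. The `Node` class, the `c_puct` value, and the tactic names are hypothetical and only for illustration; a real prover would get priors and values from a learned network rather than hard-coded numbers.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    prior: float                 # policy probability for this move (assumed given)
    visit_count: int = 0
    value_sum: float = 0.0
    children: dict = field(default_factory=dict)

    def q(self) -> float:
        # Mean value of simulations through this node; 0 if unvisited.
        return self.value_sum / self.visit_count if self.visit_count else 0.0


def puct_select(parent: Node, c_puct: float = 1.5):
    """Return the (action, child) pair that maximises Q + U."""
    total_visits = sum(child.visit_count for child in parent.children.values())

    def score(child: Node) -> float:
        # Exploration bonus: large for high-prior, rarely visited children.
        u = c_puct * child.prior * math.sqrt(total_visits) / (1 + child.visit_count)
        return child.q() + u

    return max(parent.children.items(), key=lambda item: score(item[1]))


# Hypothetical proof state with three candidate tactics.
root = Node(prior=1.0)
root.children = {
    "apply_lemma": Node(prior=0.6, visit_count=10, value_sum=2.0),
    "induction": Node(prior=0.3, visit_count=1, value_sum=0.9),
    "simplify": Node(prior=0.1),
}
action, child = puct_select(root)
print(action)
```

The rule trades off the value estimate so far against the prior-weighted exploration bonus, which is what lets the search spend its budget on promising but under-explored branches.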
- AI characters that authors can put months/years of work into
- right now, the only input you provide to an AI character is a prompt, which limits how interesting the characters can end up
- you should be able to put way more information into them, either via fine-tuning or just longer contexts
- maybe they should be goal-directed with RL, trying to get you to say a certain thing or to move the story in a certain direction
- really these are just a new form of literature
- voice-to-voice models that think in sound
- right now people are doing voice with a Whisper -> GPT-4 -> ElevenLabs pipeline
- really it should be one big end-to-end model, so that the part that does the thinking knows how you said something
- you want it to be able to interrupt you, talk over you, etc.
- write a history book automatically
- it feels like language models are almost at the point that they could do the research, organize it, and put it into book form
- it's possible you want to fine-tune the whole process on actual history books with RL, so it learns how to do the latent research behind existing books
- easier if you have models with super long context windows
- language models with true billion token context windows
- feels like it's probably possible
- you need to store 1B keys and values in GPU memory; with hidden dim 4096 at 16-bit precision this is about 16TB per layer, which easily fits if you're training on 2048 A100s
- then you just need to figure out a way to select which ones to attend over for any given token that doesn't cost too much computation
- also need datasets with sequences 1B tokens long (a book is maybe 200k tokens)
- could construct these sequences by concatenating a bunch of related documents together, so the model has an incentive to find the parts of the context that are most useful for its current prediction
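
A quick check of the memory estimate, plus a toy version of the "select which ones to attend over" step. This is a sketch under stated assumptions, not a real design: it assumes 16-bit keys and values, 80GB A100s, and a single attention layer, and the function names `kv_cache_bytes` and `topk_attention` are made up for illustration. Note that the toy selection still scores every key, so it does not solve the compute problem; a real system would need some cheaper index or learned retrieval in its place.

```python
import math


def kv_cache_bytes(num_tokens: int, hidden_dim: int,
                   bytes_per_value: int = 2, num_layers: int = 1) -> int:
    """Bytes needed for one key and one value vector per token per layer."""
    return num_tokens * 2 * hidden_dim * bytes_per_value * num_layers


per_layer = kv_cache_bytes(1_000_000_000, 4096)   # 16-bit values, one layer
print(per_layer / 1e12, "TB per layer")            # about 16.4 TB

# Assumption: 80GB A100s (the 40GB variant would halve this).
cluster_bytes = 2048 * 80 * 10**9
print(cluster_bytes / 1e12, "TB across the cluster")


def topk_attention(query, keys, values, k):
    """Softmax attention restricted to the k highest-scoring keys."""
    scale = 1.0 / math.sqrt(len(query))
    # Toy scoring: a full scan over all keys (this is the expensive part).
    scores = [scale * sum(q * x for q, x in zip(query, key)) for key in keys]
    top = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]
    m = max(scores[i] for i in top)                # subtract max for stability
    weights = [math.exp(scores[i] - m) for i in top]
    z = sum(weights)
    dim = len(values[0])
    out = [sum(w / z * values[i][d] for w, i in zip(weights, top))
           for d in range(dim)]
    return out, top


query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.9, 0.1]]
values = [[1.0], [2.0], [3.0], [4.0]]
output, attended = topk_attention(query, keys, values, k=2)
print(attended, output)
```

Under these assumptions the 16TB figure is per layer, so a model with many attention layers would need proportionally more memory unless the cache is shared or compressed across layers.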
- end-to-end chip design
- basically an optimization problem, just one that's sort of difficult to write down
- you want to be able to do it as one big RL task, where the model is just trying to make some distribution of programs run as fast as possible
- might even want it to output a gate layout directly
- general purpose robotics
- robotics is getting very good!
- this was trained in simulation and then transferred zero-shot to the real world
- general approach might be to put a ton of effort into making a super realistic simulator with thousands of different pretraining tasks, and scaling up a lot
- you might also need to make thousands of physical robots to train on if the simulator isn't good enough
- may help to pretrain on the internet, probably language and video
- AGI self-driving
- if we want self-driving cars to be able to go anywhere in any conditions, we need them to be able to generalize
- probably the thing to do is to pretrain on giant text and video datasets, and then specialize them to driving
- people who work on self-driving say you'll never get the level of reliability you need that way, but I think you probably will; someone just needs to try it
- (I don't think Cruise/Waymo are doing this, so it could be a startup)
- a big predictive medical model
- train on all the sequences of patient records, scans, tests, treatments, and outcomes
- this is effectively a very big offline RL dataset
- then like a decision transformer, you can predict which treatment will lead to the best outcome for any given patient
- the hardest part is probably getting access to all the medical records, but if you can coordinate everyone, you could probably do a lot better than human doctors
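
To make the decision transformer framing concrete: that approach trains a sequence model on (return-to-go, state, action) triples, then at inference time conditions on a high desired return and reads off the predicted action. Here is a minimal sketch of just the data preparation step. The `Visit` layout, the reward values, and the patient record are all made up for illustration; real records would need far richer state and a carefully chosen outcome measure.

```python
from typing import List, Tuple

# Hypothetical toy record: (state, treatment, reward) for each visit.
Visit = Tuple[str, str, float]


def returns_to_go(rewards: List[float]) -> List[float]:
    """Undiscounted sum of rewards from each step to the end of the episode."""
    rtg, running = [], 0.0
    for r in reversed(rewards):
        running += r
        rtg.append(running)
    return rtg[::-1]


def to_training_sequence(episode: List[Visit]) -> List[Tuple[float, str, str]]:
    """Turn one patient history into (return-to-go, state, action) triples."""
    rtg = returns_to_go([reward for _, _, reward in episode])
    return [(g, state, action) for g, (state, action, _) in zip(rtg, episode)]


# Made-up history: a treatment with a short-term cost and a later benefit.
episode = [
    ("high_bp", "drug_a", -1.0),
    ("normal_bp", "monitor", 0.0),
    ("normal_bp", "discharge", 2.0),
]
sequence = to_training_sequence(episode)
for triple in sequence:
    print(triple)
```

A model trained on sequences like these learns which actions tend to co-occur with high remaining return in a given state, which is what you would query when asking for the treatment most associated with a good outcome.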
- are there methods that look very different from deep learning but have the same general shape?
- neural networks are basically just a class of universal function approximators plus a way of optimizing them (SGD)
- the basic building block is just alternating layers of linear and nonlinear functions
- you could imagine applying SGD to different classes of differentiable functions that don't look like MLPs
- evolutionary algorithms are maybe another class of optimizer, but they're usually pretty inefficient
- you would think there would be lots of classes of functions and optimizers that can scale, and neural networks aren't necessarily the most compute efficient
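
As a small illustration of mixing and matching function classes and optimizers, here is a sketch that fits the same toy target with a truncated Fourier series (a differentiable function class that isn't an MLP), once with SGD and once with a simple (1+1) evolution strategy. Everything here is an assumption chosen for illustration: the target function, the number of terms, the learning rate, and the mutation scale. The Fourier model is also linear in its parameters, so it is a much easier case than anything that would need to scale.

```python
import math
import random


def features(x: float, n_terms: int):
    # Basis of a truncated Fourier series: sin(kx), cos(kx) for k = 1..n_terms.
    return [f(k * x) for k in range(1, n_terms + 1) for f in (math.sin, math.cos)]


def predict(params, x, n_terms):
    return sum(p * phi for p, phi in zip(params, features(x, n_terms)))


def mse(params, data, n_terms):
    return sum((predict(params, x, n_terms) - y) ** 2 for x, y in data) / len(data)


def sgd(data, n_terms, steps=2000, lr=0.05, seed=0):
    rng = random.Random(seed)
    params = [0.0] * (2 * n_terms)
    for _ in range(steps):
        x, y = rng.choice(data)                    # one random example per step
        phi = features(x, n_terms)
        err = sum(p * f for p, f in zip(params, phi)) - y
        # Gradient of 0.5 * err**2 w.r.t. each coefficient is err * feature.
        params = [p - lr * err * f for p, f in zip(params, phi)]
    return params


def one_plus_one_es(data, n_terms, steps=2000, sigma=0.1, seed=0):
    rng = random.Random(seed)
    params = [0.0] * (2 * n_terms)
    best = mse(params, data, n_terms)
    for _ in range(steps):
        # Mutate every parameter; keep the child only if it is no worse.
        candidate = [p + rng.gauss(0.0, sigma) for p in params]
        loss = mse(candidate, data, n_terms)
        if loss <= best:
            params, best = candidate, loss
    return params


def target(x: float) -> float:
    return math.sin(x) + 0.5 * math.cos(2 * x)


n_terms = 3
data = [(x, target(x)) for x in (i * 2 * math.pi / 64 for i in range(64))]
initial_loss = mse([0.0] * (2 * n_terms), data, n_terms)
sgd_loss = mse(sgd(data, n_terms), data, n_terms)
es_loss = mse(one_plus_one_es(data, n_terms), data, n_terms)
print(initial_loss, sgd_loss, es_loss)
```

Both optimizers only need a loss they can evaluate, and SGD additionally needs a gradient, so swapping in a different function class mostly means changing `features` and `predict`. How either combination behaves at scale is exactly the open question.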