Looks like an interesting project. The thing is, I don't think qualities like "reliable, interpretable, and steerable" can simply be added "on top of" existing deep learning systems and methods.
Much is made of GPT-3's ability to sometimes do logic or even arithmetic. But that ability is unreliable, and it's smeared across the whole giant model. Extracting a particular piece of specifically logical reasoning from the model is a hard problem. You can do it, but at N times the cost of the model. And in general, you can add extras to the basic functionality of deep neural nets (few-shot learning, generation, etc.), but again at a cost of N times the base, plus decreased reliability. The "full" qualities mentioned above would require many, many such extras, each comparable to one-shot learning, and they would need to happen on the fly. (And one-shot learning seems fairly easy: take a system that recognizes images by label ("red", "vehicle", etc.), show it thing X, and have it use the categories thing X activates to decide whether other things are similar to thing X. Simple, but there's still lots of tuning to do there.)
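To make that one-shot sketch concrete, here's roughly what I mean in code. It's a toy illustration only: the classifier object, its `predict` method, the label set, and the cosine-similarity measure are all my own assumptions, not anything a particular system actually does.

    import numpy as np

    LABELS = ["red", "vehicle", "metallic", "round", "animal"]

    def label_activations(classifier, image):
        """One score per label in LABELS, assuming the classifier exposes them."""
        return np.asarray(classifier.predict(image), dtype=float)

    def one_shot_similarity(classifier, example_x, candidate):
        """How similarly do thing X and the candidate light up the label set?"""
        a = label_activations(classifier, example_x)
        b = label_activations(classifier, candidate)
        # Cosine similarity between activation vectors; deciding where to put
        # the "similar enough" threshold is exactly the tuning mentioned above.
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))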
Just to emphasize: I think they'll need something extra in the basic approach itself.
Looking at the team, it seems to be all ex-OpenAI employees, and one of the cofounders worked on building GPT-3. It will be exciting to see what they are working on and whether it will be similar to OpenAI's work but more commercialized.
I can't find any mention of who currently comprises the core research team. It mentions Dario Amodei as CEO, and their listed prior work suggests some others from OpenAI may be tagging along. However, the success of this group is going to be highly dependent on the caliber of the research team, and I was hoping to see at least a few prominent researchers listed. I believe OpenAI launched with four or five notable researchers as well as close ties to academia via the AI group at Berkeley. Does anyone have further info on the research team?
Excited for this! While OpenAI has generated plenty of overhyped results (imo as an AI researcher), their focus on large-scale empirical research is pretty different from most of the field and has yielded some great discoveries. And with this being started by many of the safety and policy people from OpenAI, I am pretty optimistic about it.
Is it appropriate to ask an on-topic but naive question?
As a layman, it seems to me that safer and more legible AI would come from putting neural networks inside a more 'old-fashioned AI' framework: layers of control loops, decision trees, etc., based on the output of the black-box AI. That way, if your robot crashes or whatever, the computer vision output is at least in some legible intermediate form and the motion is amenable to traditional analysis.
This can't be an original thought; it must already be done to a large extent. But I get the impression that the way things are going is to have the trained model 'drive' the output directly, whatever the application is. Can someone with current industry knowledge comment?
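To make the question concrete, here's the kind of structure I'm imagining, as a toy sketch. Everything in it is invented for illustration (the `vision_model.infer` call, the thresholds, the command names); the point is just that the black box's output is squeezed into a legible intermediate form before hand-auditable rules act on it.

    from dataclasses import dataclass

    @dataclass
    class Perception:
        obstacle_ahead: bool
        distance_m: float
        confidence: float

    def perceive(vision_model, camera_frame) -> Perception:
        """Black-box step: whatever the network outputs, reduced to an inspectable record."""
        obstacle, dist, conf = vision_model.infer(camera_frame)  # hypothetical model API
        return Perception(bool(obstacle), float(dist), float(conf))

    def control_step(p: Perception) -> str:
        """Legible step: plain rules you can audit after the robot crashes."""
        if p.confidence < 0.5:
            return "stop"    # don't trust the black box -> fail safe
        if p.obstacle_ahead and p.distance_m < 2.0:
            return "brake"
        return "cruise"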
This looks quite promising!
Their paper "Concrete Problems in AI Safety"[1] is interesting, though it could be more concrete. They've run into the "common sense" problem, which I sometimes define, for robots, as "getting through the next 30 seconds without screwing up". They're trying to address it by playing with the weighting in goal functions for machine learning.
They write: "Yet intuitively it seems like it should often be possible to predict which actions are dangerous and explore in a way that avoids them, even when we don’t have that much information about the environment." For humans, yes. None of the machine learning tweaks they suggest do that, though. If your constraints are in the objective function, the objective function needs to contain the model of "don't do that", which means you've just moved the common sense problem into the objective function.
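To spell that out: in the usual reward-shaping formulation the "constraint" is just a penalty term, and the whole burden lands on the predicate that decides what counts as dangerous. A minimal sketch (the weight and the `is_dangerous` predicate are placeholders I made up, not anything from the paper):

    LAMBDA = 10.0  # hand-tuned weight on the "danger" penalty

    def is_dangerous(state) -> bool:
        # Placeholder: writing this predicate in general *is* the common sense problem.
        return getattr(state, "near_cliff", False)

    def shaped_reward(env_reward: float, state) -> float:
        # The constraint lives in the objective: all of "don't do that" now has
        # to be encoded inside is_dangerous().
        return env_reward - LAMBDA * float(is_dangerous(state))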
Important problem to work on, even though nobody has made much progress on it in decades.
[1] https://arxiv.org/pdf/1606.06565.pdf