The power of continuous learning

During my first 2.5 years at OpenAI, I worked on the Robotics team on a moonshot idea: we wanted to teach a single, human-like robot hand to solve a Rubik’s Cube. It was a tremendously exciting, challenging, and emotional experience. We solved the challenge with deep reinforcement learning (RL), crazy amounts of domain randomization, and no real-world training data. More importantly, we conquered the challenge as a team.

From simulation and RL training to vision perception and hardware firmware, we collaborated closely and cohesively. It was an amazing experiment, and during that time I often thought of Steve Jobs’ reality distortion field: when you believe in something strongly enough and keep pushing on it persistently, somehow you can make the impossible possible.

At the beginning of 2021, I started leading the Applied AI Research team. Managing a team presents a different set of challenges and requires changes in working style. I’m most proud of several projects related to language model safety within Applied AI:

  1. We designed and constructed a set of evaluation data and tasks to assess the tendency of pre-trained language models to generate hateful, sexual, or violent content.
  2. We created a detailed taxonomy and built a strong classifier to detect unwanted content and identify the reason the content is inappropriate.
  3. We are working on various techniques to make the model less likely to generate unsafe outputs.

As the Applied AI team works out the best way to deploy cutting-edge AI techniques, such as large pre-trained language models, we see how powerful and useful they are for real-world tasks. We are also aware of how important it is to deploy these techniques safely, as emphasized in our Charter.

