Here are rough notes (to self) on the Copilot story. Source
- knew that they wanted to build something using GPT-3
- started prototyping: demos were fabulous
- demo being good is not a sufficient condition
- models were not good enough for a chat interface - 25% of answers I loved, 75% were garbage
- code synthesis - synthesizing large function calls - not that satisfying
- small-scale autocomplete with the large models - IntelliSense dropdown UI
- UI was not the right thing
- User would get multiple options for the function body - read and pick the right one
- use the human feedback to improve the model
- reasons this was bad
- hit a key to request it
- wait for it to come back
- read three functions and click the right one - too much cognitive effort
- result was that none of them were good, or you couldn’t tell which one was
- lots of effort on the user but not a lot coming out of it
- Alex said to use the cursor position in the AST to figure out where you are in the code
- if you are at the beginning, complete the whole block
- if you are in the middle, just complete one line
- automatically generated with no user interaction
- model was small enough to be low-latency but big enough to be accurate
- only once all of these pieces were in place did the median new user love Copilot
- other dead-ends too along the way.
- it was obvious it was good - 100s of users who were GitHub engineers
- retention numbers
- 60%+ at 30 days
- very intrusive product
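
To make the cursor-position idea in the notes above concrete (whole block when the cursor is at the start of a block, a single line otherwise), here is a minimal sketch. It is not how Copilot itself does it; the notes do not say which parser or rules were used. Assumptions: Python's standard `ast` module stands in for the parser, the editor has already auto-indented the empty cursor line, and the names `completion_scope` and `MARKER` are hypothetical.

```python
import ast

MARKER = "__cursor__"  # hypothetical placeholder name, used only in this sketch


def _is_marker(stmt: ast.stmt) -> bool:
    """True if the statement is the bare placeholder expression."""
    return (
        isinstance(stmt, ast.Expr)
        and isinstance(stmt.value, ast.Name)
        and stmt.value.id == MARKER
    )


def completion_scope(source: str, cursor_line: int) -> str:
    """Return "block" if the cursor opens a block body, otherwise "line".

    cursor_line is 1-indexed. This is an illustrative heuristic only.
    """
    lines = source.split("\n")
    current = lines[cursor_line - 1]

    # Text already on the cursor line means we are mid-statement.
    if current.strip():
        return "line"

    # Drop a placeholder on the empty line so unfinished code still parses.
    lines[cursor_line - 1] = current + MARKER
    try:
        tree = ast.parse("\n".join(lines))
    except SyntaxError:
        return "line"  # fall back to the conservative choice

    # The cursor opens a block if the placeholder is the first statement
    # of some compound statement's body (def, for, if, else, ...).
    for node in ast.walk(tree):
        if isinstance(node, ast.Module):
            continue  # top of file is not treated as a block here
        for field in ("body", "orelse"):
            stmts = getattr(node, field, None)
            if isinstance(stmts, list) and stmts and _is_marker(stmts[0]):
                return "block"
    return "line"
```

For example, `completion_scope("def add(a, b):\n    ", 2)` falls into the whole-block case because the function body is still empty, while a cursor after an existing statement in that body falls into the one-line case. A production tool would more likely need an error-tolerant parser, since half-typed code often fails a strict parse.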