GitHub Copilot: A Revolution in the Tech World or an AI Takeover?

 

GitHub recently announced Copilot, an AI-powered pair programmer designed to help developers write code faster and with less effort. The service learns from comments and existing code, suggesting individual lines and whole function implementations.




Powered by Codex, the AI system created by OpenAI, Copilot works with many frameworks and languages. Nat Friedman, CEO of GitHub, says the technical preview works best with Python, JavaScript, TypeScript, Ruby, and Go, but it is designed to understand other programming languages too.

A GitHub Copilot implementation of a sortByKey function in Python.

The Visual Studio Code extension sends the comments and code typed by the developer to the GitHub Copilot service, which synthesizes an implementation and suggests it. According to GitHub, the service is optimized for small functions with meaningful parameter names, as in the sortByKey example above.
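The sortByKey demo appears in the post only as a screenshot, but a completion of that shape might look like the sketch below. The function name comes from the demo; the body and the snake_case spelling are assumptions, not Copilot's actual output:

```python
def sort_by_key(items, key):
    """Return a new list of dicts sorted by the given key.

    Hypothetical reconstruction of the sortByKey demo: given only a
    signature and docstring like this, Copilot would typically suggest
    a short body such as the one below.
    """
    return sorted(items, key=lambda item: item[key])
```

A small, well-named function with a clear docstring is exactly the kind of prompt the service is optimized for.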


It learns and adapts to the user’s coding habits, analyzes the available codebase, and generates suggestions backed by billions of lines of public code it has been trained on.

It is the first taste of Microsoft's $1 billion investment in OpenAI, a research and development company specializing in artificial general intelligence (AGI).

How does it work?

At its core, GitHub Copilot uses a new code-generating model by OpenAI called Codex. OpenAI co-founder Greg Brockman describes it as a descendant of GPT-3 that is narrowly focused on code generation.

The Codex model was trained on terabytes of public code pulled from GitHub, along with a selection of English-language text. This gives the tool the ability to write context-aware code with remarkable accuracy.



Currently, it is available as a Visual Studio Code extension, and spots in the technical preview are limited.

Under the hood, the extension sends your code and comments to the GitHub Copilot service, which uses the Codex model to synthesize suggestions.

It speaks virtually any programming language but works best with Python, JavaScript, TypeScript, Ruby, and Go.

For each completion, the tool can generate up to 10 alternative suggestions for the user to choose from. The service improves over time by recording whether each suggestion is accepted or rejected.

How good is it?

Developers at GitHub conducted an experiment to measure the tool's accuracy. It was tested on a set of Python functions that have good test coverage in open-source repositories.

All the function bodies were deleted, and the only context provided was the function names and docstrings. Copilot filled them in correctly 43% of the time on the first attempt, and accuracy increased to 57% when it was allowed 10 attempts.
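The experiment described above can be sketched as a small evaluation harness. Here `generate` stands in for a call to the Copilot service (an assumed interface, not a real API), and each problem pairs a prompt with a callable that runs the hidden unit tests:

```python
def evaluate(problems, generate, attempts=10):
    """Measure first-try and best-of-N accuracy, as in GitHub's experiment.

    problems: dict mapping a prompt (function name + docstring) to a
              callable that runs the hidden unit tests on a candidate body.
    generate: stand-in for the Copilot service (assumed interface).
    """
    first_try = within_n = 0
    for prompt, passes_tests in problems.items():
        for attempt in range(attempts):
            candidate = generate(prompt)
            if passes_tests(candidate):
                within_n += 1
                if attempt == 0:
                    first_try += 1
                break
    total = len(problems)
    return first_try / total, within_n / total
```

With a harness like this, the 43% figure corresponds to the first-try rate and the 57% figure to the best-of-10 rate.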

This is a remarkable feat, since the tool generates working code you can use in your projects. If millions of developers adopt it once it goes public, it could significantly speed up development.

To get the most out of it, GitHub suggests dividing your code into small functions and providing meaningful function names, parameter names, and docstrings.

In other words, it truly is your pair programmer — it makes you follow software engineering best practices, and in return, it learns from your code to improve its suggestions.

What is the quality of the generated code?

Even though Copilot is correct almost half the time, its creators say its output should be reviewed.

The quality of the suggestions also depends on the existing code: Copilot can only use the current file as context. It cannot test its own code, meaning a suggestion may not even run or compile.

The FAQ also says you use the tool at your own risk, since it can suggest outdated or deprecated versions of libraries and frameworks.

There are also concerns around Copilot's training set, which contains code written by millions of developers.

The obvious question is, "Does it ever repeat code from the training data?" The answer is yes: GitHub has observed that about 0.1% of suggestions leak code verbatim from the training set.

Framing this probability as a percentage may understate the issue. A one-in-1,000 chance per suggestion becomes more serious when up to 10 alternatives are generated for each completion.
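To put a rough number on that, assume each suggestion leaks with probability 0.1% and, as a simplifying assumption, that the 10 alternatives leak independently (which is not guaranteed in practice). The chance that at least one of them leaks is then close to 1%:

```python
p_single = 0.001   # ~0.1% leak rate per suggestion, the figure GitHub reports
alternatives = 10  # up to 10 options per completion

# Probability that at least one of the 10 alternatives leaks,
# under the (simplifying) independence assumption.
p_any = 1 - (1 - p_single) ** alternatives
print(f"{p_any:.2%}")
```

So the per-completion exposure is roughly ten times the per-suggestion figure.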

However, an in-depth case study of this problem showed that these instances only happen when there is insufficient context to learn from. In particular, the model was more likely to make these mistakes when the current file was short or empty.








