  • ugit: DIY Git in Python

Welcome aboard! We're going to implement Git in Python to learn more about how Git works on the inside.

This tutorial is different from most Git internals tutorials because we're not going to talk about Git only with words but also with code! We're going to write it in Python as we go.

This is not a tutorial on using Git! To follow along, I advise that you have a working knowledge of Git. If you're a newcomer to Git, this tutorial is probably not the best place to start your Git journey. I suggest coming back here after you've used Git a bit and you're comfortable with making commits, branching, merging, pushing and pulling.

  • Why learn Git internals?

For most tools that we use daily, we don't really care about their internals. We can use Firefox or Vim without understanding their inner workings.

At first you shouldn't care about Git internals either. You can use Git as a set of CLI commands that track code history. Run git add, git commit and git push all day long and you'll do fine, as long as you're a sole developer who just commits to one branch.

But once you start collaborating with multiple people on multiple branches, and things like rebase or force push get involved, it's easy to become lost if you don't have a good mental model of Git internals.

In my experience using Git myself and teaching others, a better way to improve your effectiveness with Git is to understand how it works behind the scenes, not to learn more “advanced” Git commands. This understanding is what will allow you to solve the kinds of problems that multi-user collaborative coding sometimes produces.

  • Introducing: μgit

μgit (ugit) is a small implementation of a Git-like version control system (VCS). Its top goals are simplicity and educational value. ugit is implemented in small incremental steps, with each step explained in detail. Hopefully you will be able to read the small steps (explanation and code) and slowly build a complete picture of the internals.

ugit is not exactly Git, but it shares the important ideas of Git. ugit is way shorter and doesn't implement irrelevant features. For example, to reduce complexity, ugit doesn't compress objects, doesn't save the mode of files, and doesn't save the time of a commit. But the important ideas, like commits, branches, the index, merges and remotes, are all present and are very similar to Git. If you know ugit well, you will be able to recognize the same ideas in Git.

This tutorial is organized as a series of code changes; each change contains an explanation and the diff of the change. For example, you're now reading the first change, and you can see the code that we've added in this change as a diff on the other side. The code is an empty Python application that prints “Hello, World!”.

In more detail, we created a setup.py file that describes the ugit executable. The executable calls the main() function in cli.py once invoked.
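As a rough illustration of that layout, here is a minimal sketch of what the two files might contain. The package name, version, and the exact arguments are assumptions made for this example, not a copy of the real diff; SETUP_KWARGS is a hypothetical name that stands in for the keyword arguments a setup.py would pass to setuptools.setup().

```python
# Sketch of ugit/cli.py: the function the ugit executable calls.
def main():
    print("Hello, World!")


# Sketch of setup.py: in a real project these keyword arguments would be
# passed to setuptools.setup(). The values below are assumptions.
SETUP_KWARGS = dict(
    name="ugit",
    version="1.0",
    packages=["ugit"],
    # A console_scripts entry maps a command name to "module:function".
    # Here the "ugit" command is wired to main() in ugit/cli.py.
    entry_points={"console_scripts": ["ugit = ugit.cli:main"]},
)


if __name__ == "__main__":
    main()
```

The console_scripts entry point is what turns a plain Python function into a command that can be typed in a shell: when the package is installed, a small ugit launcher is generated that imports ugit.cli and calls main().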

I also recommend downloading the source (or typing it yourself) in order to follow along and try the ideas in practice. The source for ugit is hosted in a Git repository and the command to download it can be found in the other window. If you want to run the code, I recommend installing ugit in development mode. Run the following command in the root directory of the project:

$ python3 setup.py develop --user

Installing in development mode creates a link to our source code instead of copying it to the installation directory, so we can still edit the source and run it immediately.

Now we can run ugit and see “Hello, World!” printed out.

To go to the next change, please press the “Next” button or use the right arrow key.

  • Why learn Git using code?

As I mentioned earlier, in this tutorial we will actually implement Git in Python. I believe that for programmers, seeing the concepts implemented in code crystallizes understanding. It's cool to see Git explained in a diagram, but when you see the same concepts in live code that you can fully understand and actually run, you can reach a deeper understanding. That's because working code can't omit any details, unlike an explanation in words.

  • Why not learn Git by reading the real Git code?

The real Git code is too complicated to be useful for learning basic concepts with ease. It is production-quality code that is optimized for speed. It is written in C. It implements many advanced Git features. It deals with a lot of edge cases that we don't care about for learning. In this tutorial we will focus on the bare minimum to get the point across.

  • About Me

Hi, I'm Nikita and this is a tutorial I've been working on for a long time. If you have any questions or suggestions, please leave a comment on any of the relevant sections.