Compare commits
61 Commits
71abdf3454...main
| SHA1 | Author | Date | |
|---|---|---|---|
| c484b41a89 | |||
| fe5ed910a3 | |||
| 9c53919802 | |||
| 30ce8c84e4 | |||
| 6841a97d18 | |||
| 556c16c081 | |||
| 7a0f86e49b | |||
| 3770c81942 | |||
| 9f8fde3c60 | |||
| 772f631768 | |||
| 7fe3e0f497 | |||
| 7896b80c42 | |||
| b854b4fa18 | |||
| 2362d69673 | |||
| d53322c256 | |||
| 7fbf6640f6 | |||
| dad9077515 | |||
| db7d608010 | |||
| c9d8b443ed | |||
| 41333f06bc | |||
| fe292c02c9 | |||
| de595261e6 | |||
| 81bf86d41b | |||
| 671fa4b6b1 | |||
| 63dcbeb9e7 | |||
| edae32dc86 | |||
| e85766f671 | |||
| 1f947e6343 | |||
| cb8e744794 | |||
| 6797bcfabe | |||
| 95355befb4 | |||
| b802e1eb9d | |||
| 817f38f49c | |||
| d00a7817ab | |||
| 8ac5264366 | |||
| cd91f18da6 | |||
| 78044a877a | |||
| b0d8cab498 | |||
| 450391089f | |||
| 1847cfbb17 | |||
| 6a91c03f40 | |||
| 4e13a27f79 | |||
| 2c940abd1d | |||
| c72370f930 | |||
| 40a19615aa | |||
| db8c1379c2 | |||
| 6f5fe864a9 | |||
| 2f8545d48e | |||
| 5faf498917 | |||
| 4540b98a88 | |||
| af1928a360 | |||
| 46a20c8b60 | |||
| fdfcfdbdad | |||
| 73eb89d397 | |||
| d666efcbd3 | |||
| 30ee2098ab | |||
| 103837cb73 | |||
| 2556bde16f | |||
| 36f6f88990 | |||
| a010615cf2 | |||
| 9634544d68 |
24
how_to/Change_05.md
Normal file
@@ -0,0 +1,24 @@
- cat-file: Print hashed objects

This command is the "opposite" of `hash-object`: it can print an object by its
OID. Its implementation just reads the file at ".ugit/objects/{OID}".

The names `hash-object` and `cat-file` aren't the clearest of names, but they
are the names that Git uses so we'll stick to them for consistency.

We can now try the full cycle:

```
$ cd /tmp/new
$ ugit init
Initialized empty ugit repository in /tmp/new/.ugit
$ echo some file > bla
$ ugit hash-object bla
0e08b5e8c10abc3e455b75286ba4a1fbd56e18a5
$ ugit cat-file 0e08b5e8c10abc3e455b75286ba4a1fbd56e18a5
some file
```

Note that the name of the file (bla) wasn't preserved as part of this process,
because, again, the object database is just about storing bytes for later
retrieval and it doesn't care which filename the bytes came from.
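
The round trip above can be sketched in a few lines of Python. This is only a minimal sketch: the function signatures, the `objects_dir` parameter and the choice of SHA-1 over the raw bytes are assumptions made for illustration, and a temporary directory stands in for `.ugit/objects`.

```python
import hashlib
import tempfile
from pathlib import Path


def hash_object(objects_dir: Path, data: bytes) -> str:
    """Store data in a file named after its hash and return the OID."""
    # Assumption: the OID is the SHA-1 hex digest of the raw bytes.
    oid = hashlib.sha1(data).hexdigest()
    (objects_dir / oid).write_bytes(data)
    return oid


def cat_file(objects_dir: Path, oid: str) -> bytes:
    """Return the bytes stored under the given OID."""
    return (objects_dir / oid).read_bytes()


# Round trip in a throwaway directory standing in for .ugit/objects.
with tempfile.TemporaryDirectory() as tmp:
    objects = Path(tmp)
    oid = hash_object(objects, b"some file\n")
    restored = cat_file(objects, oid)

print(oid, restored)
```

Nothing about the file name `bla` appears anywhere in this sketch, which mirrors the point made above: only the bytes are stored.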
17
how_to/Change_06.md
Normal file
@@ -0,0 +1,17 @@
- data: Add types to objects

As we will soon see, there will be different logical types of objects that are
used in different contexts (even though, from the Object Database's point of
view, they are all just bytes). In order to lower the chance of using an object
in the wrong context we're going to add a type tag for each object.

The type is just a string that's going to be prepended to the file, followed
by a null byte. When reading the file later we'll extract the type and verify
that it's indeed the expected type.

The default type is going to be `blob`, since by default an object is a
collection of bytes with no further semantic meaning.

We can also pass `expected=None` to `get_object()` if we don't want to verify
the type. This is useful for the `cat-file` CLI command which is a debug command
used for printing all objects.
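
A minimal sketch of the type tag, assuming an in-memory dictionary (`OBJECTS`, a hypothetical stand-in for `.ugit/objects`) and a `ValueError` on mismatch. The real code may signal a mismatch differently.

```python
import hashlib

OBJECTS = {}  # in-memory stand-in for the .ugit/objects directory


def hash_object(data, type_="blob"):
    """Store data with a type tag and return the OID."""
    # The stored object is: type, a null byte, then the content.
    obj = type_.encode() + b"\x00" + data
    oid = hashlib.sha1(obj).hexdigest()
    OBJECTS[oid] = obj
    return oid


def get_object(oid, expected="blob"):
    """Return the content of an object, optionally verifying its type."""
    type_, _, content = OBJECTS[oid].partition(b"\x00")
    type_ = type_.decode()
    # expected=None skips the check, which is what cat-file wants.
    if expected is not None and type_ != expected:
        raise ValueError(f"Expected {expected}, got {type_}")
    return content


oid = hash_object(b"hello", "tree")
print(get_object(oid, expected=None))
```

Asking for the same OID with the default `expected="blob"` raises, because the object was stored as a `tree`.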
0
how_to/Change_07.md
Normal file
26
how_to/Change_08.md
Normal file
@@ -0,0 +1,26 @@
- write-tree: List files

The next command is `write-tree`. This command will take the current working
directory and store it to the object database. If `hash-object` was for storing
an individual file, then `write-tree` is for storing a whole directory.

Like `hash-object`, `write-tree` is going to give us an OID after it's done and
we'll be able to use the OID in order to retrieve the directory at a later time.

In Git's lingo a "tree" means a directory.

We'll get into the details in later changes. In this change we'll only prepare
the code around the feature:

+ Create a `write-tree` CLI command

+ Create a `write_tree()` function in the base module. Why in the base module
  and not in the data module? Because `write_tree()` is not going to write to
  disk directly but use the object database provided by data to store the
  directory. Hence it belongs to the higher-level base module.

+ Add code to `write_tree()` to print a directory recursively. For now nothing
  is written anywhere, but we just coded the boilerplate to recursively scan a
  directory.

We continue in the next change.
8
how_to/Change_09.md
Normal file
@@ -0,0 +1,8 @@
- write-tree: Ignore .ugit files

If we run `ugit write-tree`, we will see that it also prints the content of the
.ugit directory. This directory isn't part of the user's files, so let's ignore
it.

Actually, I created a separate `is_ignored()` function. This way if we have any
other files we want to ignore later we have one place to change.
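
Here is a rough sketch of the recursive scan together with the ignore check. The `list_files` helper is a hypothetical name (the text only says that `write_tree()` prints the directory recursively), and the body of `is_ignored()` is an assumption about how such a check could look.

```python
import os
import tempfile


def is_ignored(path):
    # Anything inside a .ugit directory is not part of the user's files.
    return ".ugit" in path.split(os.sep)


def list_files(directory):
    """Recursively collect file paths below directory, skipping ignored ones."""
    found = []
    with os.scandir(directory) as entries:
        for entry in entries:
            full = os.path.join(directory, entry.name)
            if is_ignored(full):
                continue
            if entry.is_file(follow_symlinks=False):
                found.append(full)
            elif entry.is_dir(follow_symlinks=False):
                found.extend(list_files(full))
    return found


# Build a small directory with a .ugit folder and scan it.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, ".ugit", "objects"))
    os.makedirs(os.path.join(root, "other"))
    for name in ("cats.txt", os.path.join("other", "shoes.jpg"),
                 os.path.join(".ugit", "HEAD")):
        with open(os.path.join(root, name), "w") as f:
            f.write("x")
    listed = sorted(os.path.relpath(p, root) for p in list_files(root))

print(listed)
```

Keeping the check in one function means a later ignore rule only has to be added in a single place.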
12
how_to/Change_10.md
Normal file
@@ -0,0 +1,12 @@
- write-tree: Hash the files

Instead of only printing the file name, let's put all files in the object
database. For now we'll print their OID and their name.

Notice that instead of getting one OID to represent a directory we now get a
separate OID for each file, which isn't very useful. Plus, note that the names
of the files aren't stored in the object database, they are just printed and
then the information is discarded.

So at this stage `write-tree` isn't useful (it just saves a bunch of files as
blobs) but the next change will fix it.
62
how_to/Change_11.md
Normal file
@@ -0,0 +1,62 @@
- write-tree: Write tree objects

Now comes the fun part, where we turn a collection of separate files into a
single object that represents a directory.


The idea is that we will create one additional object that collects all the data
necessary to store a complete directory. For example, if we have a directory
with two files:
```
$ ls
cats.txt dogs.txt
```

And we want to save the directory, we will first put the individual files into
the object database:
```
$ ugit hash-object cats.txt
91a7b14a584645c7b995100223e65f8a5a33b707
$ ugit hash-object dogs.txt
fa958e0dd2203e9ad56853a3f51e5945dad317a4
```

Then we will create a "tree" object that has the content of:
```
91a7b14a584645c7b995100223e65f8a5a33b707 cats.txt
fa958e0dd2203e9ad56853a3f51e5945dad317a4 dogs.txt
```

And we will put this tree object into the object database as well. Then the OID
of the tree object will actually represent the entire directory! Why? Because we
can first retrieve the tree object by its OID, then see all the files it
contains (their names and OIDs) and then read all the OIDs of the files to get
their actual content.

What if our directory contains other directories? We'll just create tree objects
for them as well and we'll allow one tree object to point to another:
```
$ ls
cats.txt dogs.txt other/
$ ls other/
shoes.jpg
```

The root tree object will look like this:
```
blob 91a7b14a584645c7b995100223e65f8a5a33b707 cats.txt
blob fa958e0dd2203e9ad56853a3f51e5945dad317a4 dogs.txt
tree 53891a3c27b17e0f8fd96c058f968d19e340428d other
```

Note that we added a type to each entry so that we know if it's a file or a
directory. The tree that represents the "other" directory (OID 53891a3c27b17e0f8fd96c058f968d19e340428d) looks like:
```
blob 0aa186b09fd81e8cf449ba10eee6aff9711cc1ac shoes.jpg
```
We can think about this structure as a tree you know from Computer Science where
each entry's OID is a pointer to either another tree or to a file (leaf node).

Note that we actually save the tree objects with type "tree" in
`data.hash_object()` since we don't want the trees to be confused with regular
files.
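
To make the idea concrete, here is a small sketch of how tree objects could be written. It assumes an in-memory object store (`OBJECTS`) and a simplified `hash_object()`. Sorting the entries is also an assumption, made so that the same directory always produces the same tree OID.

```python
import hashlib
import os
import tempfile

OBJECTS = {}  # in-memory stand-in for the object database


def hash_object(data, type_="blob"):
    obj = type_.encode() + b"\x00" + data
    oid = hashlib.sha1(obj).hexdigest()
    OBJECTS[oid] = obj
    return oid


def write_tree(directory):
    """Store directory recursively and return the OID of its tree object."""
    entries = []
    with os.scandir(directory) as it:
        for entry in it:
            if entry.name == ".ugit":
                continue
            if entry.is_file(follow_symlinks=False):
                with open(entry.path, "rb") as f:
                    entries.append((entry.name, hash_object(f.read()), "blob"))
            elif entry.is_dir(follow_symlinks=False):
                # A subdirectory becomes its own tree object.
                entries.append((entry.name, write_tree(entry.path), "tree"))
    # One "type OID name" line per entry.
    tree = "".join(f"{type_} {oid} {name}\n"
                   for name, oid, type_ in sorted(entries))
    return hash_object(tree.encode(), "tree")


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "other"))
    for name in ("cats.txt", "dogs.txt", os.path.join("other", "shoes.jpg")):
        with open(os.path.join(root, name), "w") as f:
            f.write(name)
    root_oid = write_tree(root)

root_tree = OBJECTS[root_oid].partition(b"\x00")[2].decode()
print(root_tree)
```

The printed root tree has two `blob` lines and one `tree` line, in the same shape as the example above (the OIDs differ because the file contents differ).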
28
how_to/Change_12.md
Normal file
@@ -0,0 +1,28 @@
- read-tree: Extract tree from object

This command will take an OID of a tree and extract it to the working directory.
Kind of the opposite of `write-tree`.

I divided the implementation into a few layers:

`_iter_tree_entries` is a generator that will take an OID of a tree, tokenize it
line-by-line and yield the raw string values.

`get_tree` uses `_iter_tree_entries` to recursively parse a tree into a
dictionary.

`read_tree` uses `get_tree` to get the file OIDs and writes them into the
working directory.

Now we can actually save versions of the working directory! It's nothing like
proper version control, but we can see that a super basic flow is possible:

+ Imagine you work on some code and you want to save a version.
+ You run `ugit write-tree`.
+ You remember the OID that was printed out (write it on a post-it note or
  something :)).
+ Continue working and repeat steps 2 and 3 as necessary.
+ If you want to return to a previous version, use `ugit read-tree` to restore
  it to the working directory.

Is it convenient to use? No. But it's just the beginning!
7
how_to/Change_13.md
Normal file
@@ -0,0 +1,7 @@
- read-tree: Delete all existing stuff before reading

This is done so that we won't have any old files left around after a read-tree.

Before this change, if we save tree A which contains only `a.txt`, then we save
tree B which contains `a.txt` and `b.txt` and then we `read-tree` A, we will
have `b.txt` left over in the working directory.
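
One possible way to clear the working directory before reading a tree is sketched below. The helper name `empty_directory` and its details are assumptions for illustration. The key points are walking bottom-up and leaving `.ugit` alone.

```python
import os
import tempfile


def is_ignored(path):
    return ".ugit" in path.split(os.sep)


def empty_directory(root):
    """Delete everything under root except ignored paths such as .ugit."""
    # Walk bottom-up so files are removed before their parent directories.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            if not is_ignored(os.path.relpath(path, root)):
                os.remove(path)
        for dirname in dirnames:
            path = os.path.join(dirpath, dirname)
            if is_ignored(os.path.relpath(path, root)):
                continue
            try:
                os.rmdir(path)
            except OSError:
                # Directory is not empty, so leave it in place.
                pass


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, ".ugit"))
    os.makedirs(os.path.join(root, "sub"))
    for name in ("a.txt", "b.txt", os.path.join("sub", "c.txt"),
                 os.path.join(".ugit", "HEAD")):
        with open(os.path.join(root, name), "w") as f:
            f.write("x")
    empty_directory(root)
    remaining = sorted(os.listdir(root))
    head_kept = os.path.isfile(os.path.join(root, ".ugit", "HEAD"))

print(remaining, head_kept)
```

After this runs, only the `.ugit` directory is left, so a following `read-tree` starts from a clean slate and a stale `b.txt` cannot survive.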
31
how_to/Change_14.md
Normal file
@@ -0,0 +1,31 @@
- commit: Create commit

So far we were able to save versions of a directory (with `write-tree`), but
without any additional context. In reality, when we save a snapshot we would
like to attach data such as:
+ Message describing it
+ When the snapshot was created
+ Who created the snapshot
+ ...

We will create a new type of object called a "commit" that will store all this
information. A commit will just be a text file stored in the object database
with the type of `'commit'`.

The first lines in the commit will be key-values, then an empty line will mark
the end of the key-values and then the commit message will follow. Like this:

```
tree 5e550586c91fce59e0006799e0d46b3948f05693
author Nikita Leshenko
time 2019-09-14T09:31:09+00:00

This is the commit message!
```

For now we'll just write the "tree" key and the commit message to the commit
object.

We will create a new `ugit commit` command that will accept a commit message,
snapshot the current directory using `ugit write-tree` and save the resulting
object.
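
The commit layout described above (key-value lines, an empty line, then the message) can be produced with a tiny formatter. `format_commit` is a hypothetical helper used only to illustrate the text format.

```python
def format_commit(headers, message):
    """Serialize key-value headers, an empty line, then the message."""
    lines = [f"{key} {value}" for key, value in headers.items()]
    return "\n".join(lines) + "\n\n" + message + "\n"


commit_text = format_commit(
    {
        "tree": "5e550586c91fce59e0006799e0d46b3948f05693",
        "author": "Nikita Leshenko",
        "time": "2019-09-14T09:31:09+00:00",
    },
    "This is the commit message!",
)
print(commit_text)
```

Because the headers end at the first empty line, the message itself is free to contain any text, including lines that look like key-values.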
10
how_to/Change_15.md
Normal file
@@ -0,0 +1,10 @@
- commit: Record hash of last commit to HEAD

I would like to link new commits to older commits. Right now, if we make changes
in the working directory and make periodic commits, each commit will be a
standalone object, separate from all other commits. The motivation for linking
them together is so that we can look at the commits as a series of snapshots in
some order.

Before we can do it, let's record the OID of the last commit that we created.
We'll call the last commit the "HEAD" and just put the OID in the .ugit/HEAD file.
23
how_to/Change_16.md
Normal file
@@ -0,0 +1,23 @@
- commit: set parent to HEAD

When creating a new commit, we will use the HEAD to link the new commit to the
previous commit. We'll call the previous commit the "parent commit" and we will
save its OID in the "parent" key on the commit object.

For example, if HEAD is currently bd0de093f1a0f90f54913d694a11cccf450bd990 and
we create a new commit, the new commit will look like this in the object store:

```
tree 50bed982245cd21e2798f179e0b032904398485b
parent bd0de093f1a0f90f54913d694a11cccf450bd990

This is the commit message!
```

The first commit in the repository will obviously have no parent.

Now we can retrieve the entire list of commits just by referencing the last
commit! We can start from the HEAD, read the "parent" key on the HEAD commit and
discover the commit before HEAD. Then read the parent of that commit, and go
back on and on... This is basically a linked list implemented over the object
database.
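
A sketch of how HEAD and the "parent" key fit together, assuming in-memory dictionaries for the object database (`OBJECTS`) and for HEAD (`REFS`). Both are hypothetical stand-ins for files under `.ugit`.

```python
import hashlib

OBJECTS = {}  # stand-in for .ugit/objects
REFS = {}     # stand-in for files such as .ugit/HEAD


def hash_object(data, type_):
    obj = type_.encode() + b"\x00" + data
    oid = hashlib.sha1(obj).hexdigest()
    OBJECTS[oid] = obj
    return oid


def commit(tree_oid, message):
    """Create a commit object on top of HEAD and move HEAD to it."""
    text = f"tree {tree_oid}\n"
    parent = REFS.get("HEAD")
    if parent:
        # The very first commit has no HEAD yet, hence no parent line.
        text += f"parent {parent}\n"
    text += f"\n{message}\n"
    oid = hash_object(text.encode(), "commit")
    REFS["HEAD"] = oid
    return oid


first = commit("50bed982245cd21e2798f179e0b032904398485b", "First commit")
second = commit("50bed982245cd21e2798f179e0b032904398485b", "Second commit")
second_text = OBJECTS[second].partition(b"\x00")[2].decode()
print(second_text)
```

The second commit carries a `parent` line naming the first one, which is exactly the link that makes the history a linked list.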
12
how_to/Change_17.md
Normal file
@@ -0,0 +1,12 @@
- log: Implement

`log` will walk the list of commits and print them.

We will start by implementing `get_commit()` that will parse a commit object by
OID.

Then in the CLI module we will start from the HEAD commit and walk its parents
until we reach a commit without a parent.

The result is that the entire commit history is printed to the screen once we
run `ugit log`.
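
Parsing a commit and walking the parents might look like the sketch below. The names `parse_commit` and `walk_log`, the `Commit` tuple and the fake OIDs are assumptions for illustration. The commit texts live in a plain dictionary instead of the object database.

```python
import itertools
from collections import namedtuple

Commit = namedtuple("Commit", ["tree", "parent", "message"])


def parse_commit(text):
    """Split a commit's text into its key-values and its message."""
    lines = iter(text.splitlines())
    tree = parent = None
    # Header lines run until the first empty line, which takewhile consumes.
    for line in itertools.takewhile(bool, lines):
        key, value = line.split(" ", 1)
        if key == "tree":
            tree = value
        elif key == "parent":
            parent = value
        else:
            raise ValueError(f"Unknown field {key}")
    # Whatever is left after the empty line is the message.
    return Commit(tree=tree, parent=parent, message="\n".join(lines))


def walk_log(commits, oid):
    """Yield (OID, message) from oid back to the first commit."""
    while oid:
        commit = parse_commit(commits[oid])
        yield oid, commit.message
        oid = commit.parent


COMMITS = {
    "c1": "tree t1\n\nfirst\n",
    "c2": "tree t2\nparent c1\n\nsecond\n",
    "c3": "tree t3\nparent c2\n\nthird\n",
}
history = list(walk_log(COMMITS, "c3"))
print(history)
```

The walk stops at `c1` because its parsed `parent` is `None`.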
5
how_to/Change_18.md
Normal file
@@ -0,0 +1,5 @@
- log: Add oid parameter

Just a small cosmetic change: Instead of always printing the list of commits
from HEAD, add an optional parameter to specify an alternative commit OID to
start from. By default it will still be HEAD.
90
how_to/Change_19.md
Normal file
@@ -0,0 +1,90 @@
- checkout: Read tree and move HEAD

When given a commit OID, `ugit checkout` will "checkout" that commit, meaning
that it will populate the working directory with the content of the commit and
move HEAD to point to it.

This is a small but important change and it greatly expands the power of ugit in
two ways.

First, it allows us to travel conveniently in history. If we've made a handful
of commits and we would like to revisit a previous commit, we can now "checkout"
that commit to the working directory, play with it (compile, run tests, read
code, whatever we want) and checkout the latest commit again to resume working
where we left off.

You might be wondering why `checkout` is needed when we could just use
`read-tree`, and the answer is that moving HEAD in addition to reading the tree
allows us to record which commit is checked out right now. If we only used
`read-tree` and later forgot which commit we are looking at, we would see a bunch
of files in the working directory and have no idea where they came from. On the
other hand, if we use `checkout`, the commit will be recorded in HEAD and we can
always know what we're looking at (by running `ugit log` for example and seeing the first entry).

The second way by which `checkout` expands the power of ugit is by allowing
multiple branches of history. Let me explain: So far we have set HEAD to point
to the latest commit that was created. It means that all our commits were
linear: each new commit was added on top of the previous. The `checkout`
command now allows us to move HEAD to any commit we wish. Then, new commits will
be created on top of the current HEAD commit, which isn't necessarily the last
created commit.

For example, imagine that we're working on some code. So far, we have created a
few commits, represented by a graph:
```
o-----o-----o-----o
^                 ^
first commit      HEAD
```

Then we wanted to code a new feature. We created a few commits while working on
the feature (new commits represented by @):
```
o-----o-----o-----o-----@-----@-----@
^                                   ^
first commit                        HEAD
```

Now we have an alternative idea for implementing that feature. We would like to
go back in time and try a different implementation, without throwing away the
current implementation. We can remember the current HEAD and run `ugit checkout`
to go back in time, by providing the OID of the commit before the new feature
was implemented (that OID can be discovered with `ugit log`).
```
o-----o-----o-----o-----@-----@-----@
^                 ^
first commit      HEAD
```

The working directory will effectively go back in time. We can start working on
an alternative implementation and create new commits. The new commits will be on
top of HEAD and look like this (represented by $):
```
o-----o-----o-----o-----@-----@-----@
^                  \
first commit        ----$-----$
                              ^
                              HEAD
```

See how the history now contains two "branches". We can actually switch back and
forth between them and work on them in parallel. Finally, we can checkout the
preferred implementation and work from it on future code. Assuming that we liked
the second branch, we'll just keep working from it, and future commits will look
like this:
```
o-----o-----o-----o-----@-----@-----@
^                  \
first commit        ----$-----$-----o-----o-----o-----o-----o
                                                            ^
                                                            HEAD
```

Pretty useful, right? We've just introduced a simple form of branching history.
Note that something pretty cool happened here: The implementation of checkout is
very simple (we just call `read_tree` and update HEAD) but the implications of
checkout are quite big - we can suddenly have a branching workflow which might
look complicated but it is actually a direct consequence of what we implemented
in previous changes. This is why I believe learning Git internals from the
bottom up is useful - we can see how simple concepts compose into complicated
functionality.
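
To see how little is needed for branching to appear, here is a toy model. It is not ugit's code: `ToyRepo` is a hypothetical class that keeps the working directory, the commits and HEAD in memory and uses fake OIDs such as `c1`, purely to show that "restore the snapshot and move HEAD" is enough.

```python
import itertools


class ToyRepo:
    def __init__(self):
        self.commits = {}   # OID -> (snapshot of files, parent OID)
        self.workdir = {}   # file name -> content
        self.head = None
        self._ids = itertools.count(1)

    def commit(self):
        oid = f"c{next(self._ids)}"  # fake OID, for illustration only
        # The parent is whatever HEAD points to right now.
        self.commits[oid] = (dict(self.workdir), self.head)
        self.head = oid
        return oid

    def checkout(self, oid):
        snapshot, _parent = self.commits[oid]
        self.workdir = dict(snapshot)  # the "read_tree" part
        self.head = oid                # the "move HEAD" part


repo = ToyRepo()
repo.workdir["feature.py"] = "baseline"
base = repo.commit()
repo.workdir["feature.py"] = "first attempt"
attempt1 = repo.commit()

# Go back in time and try an alternative implementation.
repo.checkout(base)
repo.workdir["feature.py"] = "second attempt"
attempt2 = repo.commit()

print(repo.commits[attempt1][1], repo.commits[attempt2][1])
```

Both attempts end up with the same parent, so the history has forked even though no branching logic was written.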
41
how_to/Change_20.md
Normal file
@@ -0,0 +1,41 @@
- tag: Implement CLI command

Now that we have branching history we have some OIDs we need to keep track of.
Assume we have two branches (continuing from the example we had for `checkout`):
```
o-----o-----o-----o-----@-----@-----@
^                  \                ^
first commit        ----$-----$     6c9f80a187ba39b4...
                              ^
                              d8d43b0e3a21df0c...
```

If we want to switch back and forth between the two "branches" with `checkout`,
we need to remember both OIDs, which are quite long.

To make our lives easier, let's implement a command to attach a name to an OID.
Then we'll be able to refer to the OID by that name.

The end result will look like this:
```
$ # Make some changes
...
$ ugit commit
d8d43b0e3a21df0c845e185d08be8e4028787069
$ ugit tag my-cool-commit d8d43b0e3a21df0c845e185d08be8e4028787069
$ # Make more changes
...
$ ugit commit
e549f09bbd08a8a888110b07982952e17e8c9669

$ ugit checkout my-cool-commit
or
$ ugit checkout d8d43b0e3a21df0c845e185d08be8e4028787069
```

The last two commands are equivalent, because "my-cool-commit" is a tag that
points to d8d43b0e3a21df0c845e185d08be8e4028787069.

We will implement this in a few steps. The first step is to create a CLI
command that calls the relevant function in the base module. The base module
does nothing at this stage.
23
how_to/Change_21.md
Normal file
@@ -0,0 +1,23 @@
- tag: Generalize HEAD to refs

As part of implementing `tag`, we'll generalize the way we handle HEAD. If you
think about it, HEAD and tags are similar. They are both ways for ugit to attach
a name to an OID. In case of HEAD, the name is hardcoded by ugit; in case of
tags, the name will be provided by the user. It makes sense to handle them
similarly in *data.py*.

In *data.py*, let's generalize the functions `set_HEAD` and `get_HEAD` into
`update_ref` and `get_ref`. "Ref" is short for "reference", and that's the name
Git uses. The functions will now accept the name of the ref and write/read it as
a file under the *.ugit* directory. Logically, a ref is a named pointer to an object.

The important change is in *data.py*. The rest of the changes just rename some
functions:

```
- get_HEAD() -> get_ref('HEAD')
- set_HEAD(oid) -> update_ref('HEAD', oid)
```

Note that we didn't change any behaviour of ugit here, this is purely
refactoring.
28
how_to/Change_22.md
Normal file
@@ -0,0 +1,28 @@
- tag: Create the tag ref

After we've implemented refs in the previous change, it's time to create a ref
when the user creates a tag.

`create_tag` now calls `update_ref` with the tag name to actually create the tag.

For namespacing purposes, we'll put all tags under *refs/tags/*. That is, if the
user creates *my-cool-commit* tag, we'll create *refs/tags/my-cool-commit* ref
to point to the desired OID.

Then we'll update *data.py* to handle this "namespaced" ref. Since we can't have
a / in the file name, we'll create directories for it. Now if a ref
*refs/tags/sometag* is created, it will be placed under *.ugit/refs/tags* in a
file named *sometag*.

To verify that this code works, you can run:
```
$ ugit tag test
```

And make sure that the tag points to HEAD:
```
$ cat .ugit/refs/tags/test
$ cat .ugit/HEAD
```

The last two commands should give the same output.
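
A minimal sketch of refs stored as files, including the directories needed for namespaced names. The explicit `git_dir` parameter is an assumption made so the example can run in a temporary directory instead of a real *.ugit*.

```python
import os
import tempfile


def update_ref(git_dir, ref, oid):
    """Write oid into the file that represents the ref."""
    ref_path = os.path.join(git_dir, ref)
    # refs/tags/sometag needs the refs/tags directories to exist first.
    os.makedirs(os.path.dirname(ref_path), exist_ok=True)
    with open(ref_path, "w") as f:
        f.write(oid)


def get_ref(git_dir, ref):
    """Return the OID stored in the ref, or None if the ref doesn't exist."""
    ref_path = os.path.join(git_dir, ref)
    if os.path.isfile(ref_path):
        with open(ref_path) as f:
            return f.read().strip()
    return None


with tempfile.TemporaryDirectory() as git_dir:
    update_ref(git_dir, "HEAD", "d8d43b0e3a21df0c845e185d08be8e4028787069")
    # Creating a tag is just creating a ref under refs/tags/.
    update_ref(git_dir, "refs/tags/test", get_ref(git_dir, "HEAD"))
    tag_oid = get_ref(git_dir, "refs/tags/test")
    head_oid = get_ref(git_dir, "HEAD")
    missing = get_ref(git_dir, "refs/tags/nope")

print(tag_oid == head_oid, missing)
```

This is the same check as the two `cat` commands above: the tag file and HEAD hold the same OID.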
22
how_to/Change_23.md
Normal file
@@ -0,0 +1,22 @@
- tag: Resolve name to oid in argparse

It's nice that we can create tags, but now let's actually make them usable from
the CLI.

In *base.py*, we'll create `get_oid` to resolve a "name" to an OID. A name can
either be a ref (in which case `get_oid` will return the OID that the ref points
to) or an OID (in which case `get_oid` will just return that same OID).

Next, we'll modify the argument parser in *cli.py* to call `get_oid` on all
arguments which are expected to be an OID. This way we can pass a ref there
instead of an OID.

At this point we can do something like:
```
$ ugit tag mytag d8d43b0e3a21df0c845e185d08be8e4028787069
$ ugit log refs/tags/mytag
# Will print log of commits starting at d8d43b0e...
$ ugit checkout refs/tags/mytag
# Will checkout commit d8d43b0e...
etc...
```
18
how_to/Change_24.md
Normal file
@@ -0,0 +1,18 @@
- base: Try different directories when searching for a ref

In the previous change, you might have noticed that we need to spell out the
full name of a tag (like *refs/tags/mytag*). This isn't very convenient; we
would like to be able to type shorter names. For example, if we've created a
"mytag" tag, we should be able to do `ugit log mytag` rather than having to
specify `ugit log refs/tags/mytag`.

We'll extend `get_oid` to search in different ref subdirectories when resolving
a name. We'll search in:
```
Root (.ugit): This way we can specify refs/tags/mytag
.ugit/refs: This way we can specify tags/mytag
.ugit/refs/tags: This way we can specify mytag
.ugit/refs/heads: This will be needed for a future change
```
If we find the requested name in any of the directories, we return the OID it
points to. Otherwise we assume that the name is an OID.
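
The lookup order can be sketched as follows. Here `refs` is a plain dictionary standing in for the ref files, and the final check that the name looks like a 40-character hex string is an assumption about what "assume that the name is an OID" could mean in practice.

```python
import string


def get_oid(refs, name):
    """Resolve a ref name (possibly abbreviated) or an OID to an OID."""
    candidates = (
        name,                  # refs/tags/mytag
        f"refs/{name}",        # tags/mytag
        f"refs/tags/{name}",   # mytag
        f"refs/heads/{name}",  # reserved for a future change
    )
    for candidate in candidates:
        if candidate in refs:
            return refs[candidate]
    # Not a ref: accept the name only if it looks like a full OID.
    if len(name) == 40 and all(c in string.hexdigits for c in name):
        return name
    raise ValueError(f"Unknown name {name}")


OID = "d8d43b0e3a21df0c845e185d08be8e4028787069"
REFS = {"HEAD": OID, "refs/tags/mytag": OID}

print(get_oid(REFS, "mytag"))
```

All of `mytag`, `tags/mytag`, `refs/tags/mytag` and the OID itself resolve to the same value.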
12
how_to/Change_25.md
Normal file
@@ -0,0 +1,12 @@
- cli: pass HEAD by default in argparse

First, make "@" be an alias for HEAD. (Implemented in `get_oid`)

Second, do a little refactoring in *cli.py*. Some commands accept an optional
OID argument and if the argument isn't provided it defaults to HEAD. For example
`ugit log` can get an OID to start logging from, but by default it logs all
commits before HEAD.

Instead of having each command implement this logic, let's just make "@" (HEAD)
be the default value for those commands. The relevant commands at this stage
are `log` and `tag`. More will follow.
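
With `argparse`, the default can be expressed on the argument itself. The sketch below assumes a simplified `get_oid` backed by a dictionary. It relies on documented `argparse` behaviour: a string default is passed through the `type` callable just like a value typed on the command line.

```python
import argparse

REFS = {"HEAD": "d8d43b0e3a21df0c845e185d08be8e4028787069"}


def get_oid(name):
    # "@" is an alias for HEAD.
    if name == "@":
        name = "HEAD"
    # Simplified: resolve known refs, otherwise treat the name as an OID.
    return REFS.get(name, name)


def build_parser():
    parser = argparse.ArgumentParser(prog="ugit")
    commands = parser.add_subparsers(dest="command", required=True)

    log_parser = commands.add_parser("log")
    # Optional positional: falls back to "@", which get_oid resolves to HEAD.
    log_parser.add_argument("oid", default="@", type=get_oid, nargs="?")
    return parser


default_args = build_parser().parse_args(["log"])
explicit_args = build_parser().parse_args(["log", "e549f09bbd08a8a888110b07982952e17e8c9669"])
print(default_args.oid, explicit_args.oid)
```

The command implementations then always receive a resolved OID and no longer need their own "if no OID was given, use HEAD" branch.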
14
how_to/Change_26.md
Normal file
@@ -0,0 +1,14 @@
- k: Print refs

Now that we have refs and a potentially branching commit history, it's a good
idea to create a visualization tool to see all the mess that we've created.

The visualization tool will draw all refs and all the commits pointed to by the refs.

Our command to run the tool will be called `ugit k`, similar to `gitk` (which is
a graphical visualization tool for Git).

We'll create a new `k` command in *cli.py*. We'll create `iter_refs`, a
generator that iterates over all available refs (it will return HEAD from the
ugit root directory and everything under *.ugit/refs*). As a first step, let's
just print all refs when running `k`.
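
A possible shape for `iter_refs`, sketched with an explicit `git_dir` parameter (an assumption, so the example can use a temporary directory) and assuming that HEAD already exists.

```python
import os
import tempfile


def iter_refs(git_dir):
    """Yield (ref name, OID) for HEAD and everything under refs/."""
    refs = ["HEAD"]
    for root, _dirs, filenames in os.walk(os.path.join(git_dir, "refs")):
        # Turn the directory back into a ref prefix such as refs/tags.
        rel = os.path.relpath(root, git_dir).replace(os.sep, "/")
        refs.extend(f"{rel}/{name}" for name in filenames)
    for refname in refs:
        with open(os.path.join(git_dir, refname)) as f:
            yield refname, f.read().strip()


with tempfile.TemporaryDirectory() as git_dir:
    os.makedirs(os.path.join(git_dir, "refs", "tags"))
    contents = {"HEAD": "aaa", "refs/tags/tag1": "aaa", "refs/tags/tag2": "bbb"}
    for refname, oid in contents.items():
        with open(os.path.join(git_dir, refname), "w") as f:
            f.write(oid)
    found = dict(iter_refs(git_dir))

for refname in sorted(found):
    print(refname, found[refname])
```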
21
how_to/Change_27.md
Normal file
@@ -0,0 +1,21 @@
- k: Iterate commits and parents

In addition to printing the refs, we'll also print all OIDs that are reachable
from those refs. We'll create `iter_commits_and_parents`, which is a generator
that returns all commits that it can reach from a given set of OIDs.

Note that `iter_commits_and_parents` will return an OID once, even if it's
reachable from multiple refs. Here, for example:
```
o<----o<----o<----o<----@<----@<----@
^                  \                ^
first commit        -<--$<----$     refs/tags/tag1
                              ^
                              refs/tags/tag2
```

We can reach the first commit by following the parents of *tag1* or by following
the parents of *tag2*. Yet if we call `iter_commits_and_parents({tag1, tag2})`,
the first commit will be yielded only once. This property will be useful later.

(Note that nothing is visualized yet, we're preparing for that.)
18
how_to/Change_28.md
Normal file
@@ -0,0 +1,18 @@
- k: Render graph

`k` is supposed to be a visualization tool, but so far we've just printed a
bunch of OIDs... Now comes the visualization part!

There's a convenient file format called "dot" that can describe a graph. This is
a textual format. We'll generate a graph of all commits and refs in dot format
and then visualize it using the "dot" utility that comes with Graphviz.

(If you're unfamiliar with dot or Graphviz please look it up online.)

The graph will contain a node for each commit, with an edge pointing to its
parent commit. The graph will also contain a node for each ref, which points to
the relevant commit.

At this point, `ugit k` is fully functional and I encourage you to play with it.
Create a crazy branching history and a bunch of tags and see for yourself that
`ugit k` can draw all that visually.
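
Generating the dot text is mostly string formatting. In the sketch below, `render_dot` is a hypothetical helper that takes the refs and a parent map as dictionaries. The node shapes and the fake OIDs are arbitrary choices for illustration.

```python
def render_dot(refs, parents):
    """Return a dot description of refs, commits and parent links."""
    lines = ["digraph commits {"]
    for refname, oid in refs.items():
        # One node per ref, with an edge to the commit it points to.
        lines.append(f'"{refname}" [shape=note]')
        lines.append(f'"{refname}" -> "{oid}"')
    for oid, parent in parents.items():
        # One node per commit, labelled with a shortened OID.
        lines.append(f'"{oid}" [shape=box label="{oid[:10]}"]')
        if parent:
            lines.append(f'"{oid}" -> "{parent}"')
    lines.append("}")
    return "\n".join(lines)


dot_text = render_dot(
    refs={"HEAD": "c2", "refs/tags/tag1": "c1"},
    parents={"c2": "c1", "c1": None},
)
print(dot_text)
```

The resulting text can be saved to a file and rendered with Graphviz, for example with `dot -Tpng`.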
9
how_to/Change_29.md
Normal file
@@ -0,0 +1,9 @@
- log: Use `iter_commits_and_parents`

Refactoring ahead! Since we have `iter_commits_and_parents` from `k`, let's also
use this function in `log`. We'll need to adjust it a bit to use
`collections.deque` instead of a set so that the order of commits is deterministic.

This generalization might seem unneeded at this point, but it will be useful
later. (Note for the advanced folks: When we implement merge commits that have
multiple parents, this generic way to iterate will come in handy.)
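
A sketch of the deque-based iteration, assuming a `get_parent` callable (hypothetical) that maps an OID to its parent OID or `None`. The visited set keeps the "each commit is yielded once" property, and pushing the parent to the front makes a commit's ancestors come out right after it.

```python
from collections import deque


def iter_commits_and_parents(oids, get_parent):
    """Yield every commit reachable from oids exactly once."""
    pending = deque(oids)
    visited = set()
    while pending:
        oid = pending.popleft()
        if not oid or oid in visited:
            continue
        visited.add(oid)
        yield oid
        # Visit the parent next, before the other starting points.
        pending.appendleft(get_parent(oid))


# c1 <- c2 <- c3 <- c4 (tag1), with d1 (tag2) branching off c2.
PARENTS = {"c1": None, "c2": "c1", "c3": "c2", "c4": "c3", "d1": "c2"}
order = list(iter_commits_and_parents(["c4", "d1"], PARENTS.get))
print(order)
```

For `log`, calling it with a single starting OID walks straight back to the first commit, in order.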
82
how_to/Change_30.md
Normal file
82
how_to/Change_30.md
Normal file
@@ -0,0 +1,82 @@
- branch: Create new branch

Tags were an improvement since they freed us from the burden of remembering OIDs
directly. But they are still somewhat inconvenient, since they are static. Let
me illustrate:

```
o-----o-----o-----o-----o-----o-----o
       \                            ^
        ----o-----o                 tag2,HEAD
                  ^
                  tag1
```

If we have the above situation, we can easily flip between *tag1* and *tag2* with
`checkout`. But what happens if we do

- ugit checkout tag2
- Make some changes
- ugit commit?

Now it looks like this:

```
o-----o-----o-----o-----o-----o-----o-----o
       \                            ^     ^
        ----o-----o                 tag2  HEAD
                  ^
                  tag1
```

The upper branch has advanced, but *tag2* still points to the previous commit.
This is by design, since tags are supposed to just name a specific OID. So if we
want to remember the new HEAD position we need to create another tag.

But now let's create a ref that will "move forward" as the branch grows. Just
like we have `ugit tag`, we'll create `ugit branch` that will point a branch to
a specific OID. This time the ref will be created under *refs/heads*.

At this stage, `branch` doesn't look any different from `tag` (the only difference
is that the branch is created under *refs/heads* rather than *refs/tags*). But
the magic will happen once we try to `checkout` a branch.

So far, when we check out anything, we update HEAD to point to the OID that we've
just checked out. But if we check out a branch by name, we'll do something
different: we will update HEAD to point to the **name of the branch!** Assume
that we have a branch here:

```
o-----o-----o-----o-----o-----o-----o
       \                            ^
        ----o-----o                 tag2,branch2
                  ^
                  tag1
```

Running `ugit checkout branch2` will create the following situation:

```
o-----o-----o-----o-----o-----o-----o
       \                            ^
        ----o-----o                 tag2,branch2 <--- HEAD
                  ^
                  tag1
```

You see? HEAD points to *branch2* rather than the OID of the commit directly.
Now if we create another commit, ugit will update HEAD to point to the latest
commit (just like it does every time) but as a side effect it will also update
*branch2* to point to the latest commit.

```
o-----o-----o-----o-----o-----o-----o-----o
       \                            ^     ^
        ----o-----o                 tag2  branch2 <--- HEAD
                  ^
                  tag1
```

This way, if we check out a branch and create some commits on top of it, the ref
will always point to the latest commit.

But right now HEAD (or any ref for that matter) may only point to an OID. It
can't point to another ref, like I described above. So our next step would be
to implement this concept. To mirror Git's terminology, we will call a ref that
points to another ref a "symbolic ref". Please see the next change for an
implementation of symbolic refs.
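The difference between a tag and a checked-out branch can be modelled in a few lines. This is only an in-memory sketch under simplifying assumptions: the helpers `resolve_head` and `commit`, and the short fake OIDs, are hypothetical and do not mirror ugit's actual functions.

```python
def resolve_head(head, refs):
    """Return the OID HEAD ends up at, following a branch name if HEAD holds one."""
    return refs[head] if head in refs else head


def commit(head, refs, new_oid):
    """Record a new commit and return the new HEAD value."""
    if head in refs:
        # HEAD names a branch: move the branch, HEAD keeps pointing at the name.
        refs[head] = new_oid
        return head
    # HEAD holds an OID directly: only HEAD itself moves.
    return new_oid


refs = {"refs/tags/tag2": "c7", "refs/heads/branch2": "c7"}

head = "refs/heads/branch2"        # as after `ugit checkout branch2`
head = commit(head, refs, "c8")
print(refs["refs/heads/branch2"], refs["refs/tags/tag2"])
```

After the commit, *branch2* has moved to the new commit while *tag2* stays where it was, matching the last diagram.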
5
how_to/Change_31.md
Normal file
@@ -0,0 +1,5 @@
- data: Implement symbolic refs idea

If the file that represents a ref contains an OID, we'll assume that the ref
points to an OID. If the file contains the content `ref: <refname>`, we'll
assume that the ref points to `<refname>` and we will dereference it recursively.
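As a rough illustration of the file convention, the sketch below writes two ref files into a temporary directory and follows the `ref: <refname>` link. The function name `resolve` and the fake OID are assumptions for this example only.

```python
import tempfile
from pathlib import Path


def resolve(git_dir, ref):
    """Follow `ref: <refname>` links until a plain value (an OID) is found."""
    ref_file = Path(git_dir) / ref
    value = ref_file.read_text().strip() if ref_file.is_file() else None
    if value and value.startswith("ref:"):
        # Symbolic ref: the rest of the line names another ref.
        return resolve(git_dir, value.split(":", 1)[1].strip())
    return value


with tempfile.TemporaryDirectory() as git_dir:
    heads = Path(git_dir) / "refs" / "heads"
    heads.mkdir(parents=True)
    (heads / "main").write_text("1234abcd")                       # direct ref (fake OID)
    (Path(git_dir) / "HEAD").write_text("ref: refs/heads/main")   # symbolic ref
    resolved = resolve(git_dir, "HEAD")
    missing = resolve(git_dir, "refs/heads/nope")

print(resolved)
```

A ref whose file does not exist resolves to `None` in this sketch, which keeps the "no value yet" case distinct from a real OID.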
8
how_to/Change_32.md
Normal file
@@ -0,0 +1,8 @@
- data: Create RefValue container

To make working with symbolic refs easier, we will create a `RefValue` container
to represent the value of a ref. `RefValue` will have a `symbolic` property that
will say whether it's a symbolic or a direct ref.

This change is just a refactoring: we will wrap every OID that is written to or
read from a ref in a `RefValue`.
17
how_to/Change_33.md
Normal file
@@ -0,0 +1,17 @@
- data: Dereference refs when reading and writing

Now we'll dereference symbolic refs not only when reading them but also when
writing them.

We'll implement a helper function called `_get_ref_internal` which will return
the path and the value of the last ref pointed to by a symbolic ref. In simple words:

- When given a non-symbolic ref, `_get_ref_internal` will return the ref name
  and value.
- When given a symbolic ref, `_get_ref_internal` will dereference the ref
  recursively, and then return the name of the last (non-symbolic) ref that points
  to an OID, plus its value.

Now `update_ref` will use `_get_ref_internal` to know which ref it needs to update.

Additionally, we'll use `_get_ref_internal` in `get_ref`.
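The two cases above can be sketched with a dictionary standing in for the ref files. The names `get_ref_internal` and `update_ref` echo the ones in the text, but these bodies are simplified assumptions that take the dictionary as an explicit argument; they are not the actual ugit code.

```python
def get_ref_internal(raw, ref):
    """Return (name, value) of the last ref in the chain starting at `ref`."""
    value = raw.get(ref)
    if value and value.startswith("ref:"):
        return get_ref_internal(raw, value.split(":", 1)[1].strip())
    return ref, value


def update_ref(raw, ref, oid):
    """Write `oid` to the last ref in the chain and return that ref's name."""
    target, _ = get_ref_internal(raw, ref)
    raw[target] = oid
    return target


# Hypothetical ref contents, keyed by ref name.
raw = {"HEAD": "ref: refs/heads/main", "refs/heads/main": "aaaa"}
written_to = update_ref(raw, "HEAD", "bbbb")
print(written_to, raw)
```

Writing through HEAD leaves the HEAD entry untouched and updates `refs/heads/main`, which is what lets a branch move forward on commit.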
15
how_to/Change_34.md
Normal file
@@ -0,0 +1,15 @@
- data: Don't always dereference refs (for `ugit k`)

Actually, it's not always desirable to dereference a ref all the way. Sometimes
we would like to know which ref a symbolic ref points to, rather than the final
OID. Or we would like to update a ref directly, rather than updating the last
ref in the chain.

One such use case is `ugit k`. When visualizing refs it would be nice to see
which ref points to which ref. We will see another use case soon.

To accommodate this, we will add a `deref` option to `get_ref`, `iter_refs` and
`update_ref`. If they are called with `deref=False`, they will work on the
raw value of a ref and not dereference any symbolic refs.

Then we will update `k` to use `deref=False`.
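A small sketch of what the `deref` option changes for a reader. As before, the dictionary `raw` is a hypothetical stand-in for the ref files, and this `get_ref` takes it as an explicit argument, which the real function does not.

```python
from collections import namedtuple

RefValue = namedtuple("RefValue", ["symbolic", "value"])


def get_ref(raw, ref, deref=True):
    value = raw.get(ref)
    symbolic = bool(value) and value.startswith("ref:")
    if symbolic:
        value = value.split(":", 1)[1].strip()
        if deref:
            # Keep following the chain until a direct ref is reached.
            return get_ref(raw, value, deref=True)
    # With deref=False a symbolic ref is reported as-is, naming its target.
    return RefValue(symbolic=symbolic, value=value)


raw = {"HEAD": "ref: refs/heads/main", "refs/heads/main": "aaaa"}
full = get_ref(raw, "HEAD")
direct = get_ref(raw, "HEAD", deref=False)
print(full, direct)
```

With `deref=False` the caller sees that HEAD points at `refs/heads/main`, which is the edge `ugit k` wants to draw.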
176
ugit/base.py
Normal file
@@ -0,0 +1,176 @@
import itertools
import operator
import os
import string

from collections import deque, namedtuple
from pathlib import Path

from . import data


def write_tree(directory="."):
    entries = []
    # os.scandir is a context manager whose entries support
    # is_file()/is_dir() with follow_symlinks.
    with os.scandir(directory) as it:
        for entry in it:
            full = f"{directory}/{entry.name}"
            if is_ignored(full):
                continue
            if entry.is_file(follow_symlinks=False):
                type_ = "blob"
                with open(full, "rb") as f:
                    oid = data.hash_object(f.read())
            elif entry.is_dir(follow_symlinks=False):
                type_ = "tree"
                oid = write_tree(full)
            else:
                # Skip symlinks and other special files
                continue
            entries.append((entry.name, oid, type_))

    tree = "".join(f"{type_} {oid} {name}\n" for name, oid, type_ in sorted(entries))

    return data.hash_object(tree.encode(), "tree")


def _iter_tree_entries(oid):
    if not oid:
        return
    tree = data.get_object(oid, "tree")
    for entry in tree.decode().splitlines():
        type_, oid, name = entry.split(" ", 2)
        yield type_, oid, name


def get_tree(oid, base_path=""):
    result = {}
    for type_, oid, name in _iter_tree_entries(oid):
        assert "/" not in name
        assert name not in ("..", ".")
        path = base_path + name
        if type_ == "blob":
            result[path] = oid
        elif type_ == "tree":
            result.update(get_tree(oid, f"{path}/"))
        else:
            assert False, f"Unknown tree entry {type_}"
    return result


def _empty_current_directory():
    for root, dirnames, filenames in os.walk(".", topdown=False):
        for filename in filenames:
            path = os.path.relpath(f"{root}/{filename}")
            if is_ignored(path) or not Path(path).is_file():
                continue
            Path(path).unlink()
        for dirname in dirnames:
            path = os.path.relpath(f"{root}/{dirname}")
            if is_ignored(path):
                continue
            try:
                Path(path).rmdir()
            except (FileNotFoundError, OSError):
                # Deletion might fail if the directory contains ignored files,
                # so it's OK
                pass


def read_tree(tree_oid):
    _empty_current_directory()
    for path, oid in get_tree(tree_oid, base_path="./").items():
        # Create any missing parent directories before writing the file
        Path(path).parent.mkdir(parents=True, exist_ok=True)
        with open(path, "wb") as f:
            f.write(data.get_object(oid))


def commit(message):
    commit = f"tree {write_tree()}\n"

    HEAD = data.get_ref("HEAD").value
    if HEAD:
        commit += f"parent {HEAD}\n"

    commit += "\n"
    commit += f"{message}\n"

    oid = data.hash_object(commit.encode(), "commit")

    data.update_ref("HEAD", data.RefValue(symbolic=False, value=oid))

    return oid


def create_tag(name, oid):
    data.update_ref(f"refs/tags/{name}", data.RefValue(symbolic=False, value=oid))


def checkout(oid):
    commit = get_commit(oid)
    read_tree(commit.tree)
    data.update_ref("HEAD", data.RefValue(symbolic=False, value=oid))


def create_branch(name, oid):
    data.update_ref(f"refs/heads/{name}", data.RefValue(symbolic=False, value=oid))


Commit = namedtuple("Commit", ["tree", "parent", "message"])


def get_commit(oid):
    parent = None

    commit = data.get_object(oid, "commit").decode()
    lines = iter(commit.splitlines())
    # Header lines run until the first empty line
    for line in itertools.takewhile(operator.truth, lines):
        key, value = line.split(" ", 1)
        if key == "tree":
            tree = value
        elif key == "parent":
            parent = value
        else:
            assert False, f"Unknown field {key}"

    message = "\n".join(lines)
    return Commit(tree=tree, parent=parent, message=message)


def iter_commits_and_parents(oids):
    oids = deque(oids)
    visited = set()

    while oids:
        oid = oids.popleft()
        if not oid or oid in visited:
            continue
        visited.add(oid)
        yield oid

        commit = get_commit(oid)
        # Return parent next
        oids.appendleft(commit.parent)


def get_oid(name):
    if name == "@":
        name = "HEAD"

    # Name is ref
    refs_to_try = [
        f"{name}",
        f"refs/{name}",
        f"refs/tags/{name}",
        f"refs/heads/{name}",
    ]
    for ref in refs_to_try:
        if data.get_ref(ref, deref=False).value:
            return data.get_ref(ref).value

    # Name is SHA1
    is_hex = all(c in string.hexdigits for c in name)
    if len(name) == 40 and is_hex:
        return name

    assert False, f"Unknown name {name}"


def is_ignored(path):
    return ".ugit" in path.split("/")
107
ugit/cli.py
@@ -1,6 +1,11 @@
import argparse
import subprocess
import sys
import textwrap

from pathlib import Path

from . import base
from . import data

@@ -15,6 +20,8 @@ def parse_args():
    commands = parser.add_subparsers(dest="command")
    commands.required = True

    oid = base.get_oid

    init_parser = commands.add_parser("init")
    init_parser.set_defaults(func=init)

@@ -22,6 +29,42 @@ def parse_args():
    hash_object_parser.set_defaults(func=hash_object)
    hash_object_parser.add_argument("file")

    cat_file_parser = commands.add_parser("cat-file")
    cat_file_parser.set_defaults(func=cat_file)
    cat_file_parser.add_argument("object", type=oid)

    write_tree_parser = commands.add_parser("write-tree")
    write_tree_parser.set_defaults(func=write_tree)

    read_tree_parser = commands.add_parser("read-tree")
    read_tree_parser.set_defaults(func=read_tree)
    read_tree_parser.add_argument("tree", type=oid)

    commit_parser = commands.add_parser("commit")
    commit_parser.set_defaults(func=commit)
    commit_parser.add_argument("-m", "--message", required=True)

    log_parser = commands.add_parser("log")
    log_parser.set_defaults(func=log)
    log_parser.add_argument("oid", default="@", type=oid, nargs="?")

    checkout_parser = commands.add_parser("checkout")
    checkout_parser.set_defaults(func=checkout)
    checkout_parser.add_argument("oid", type=oid)

    tag_parser = commands.add_parser("tag")
    tag_parser.set_defaults(func=tag)
    tag_parser.add_argument("name")
    tag_parser.add_argument("oid", default="@", type=oid, nargs="?")

    branch_parser = commands.add_parser("branch")
    branch_parser.set_defaults(func=branch)
    branch_parser.add_argument("name")
    branch_parser.add_argument("start_point", default="@", type=oid, nargs="?")

    k_parser = commands.add_parser("k")
    k_parser.set_defaults(func=k)

    return parser.parse_args()

@@ -33,3 +76,67 @@ def init(args):
def hash_object(args):
    with open(args.file, "rb") as f:
        print(data.hash_object(f.read()))


def cat_file(args):
    sys.stdout.flush()
    # expected=None is an argument of get_object, not of write
    sys.stdout.buffer.write(data.get_object(args.object, expected=None))


def write_tree(args):
    print(base.write_tree())


def read_tree(args):
    base.read_tree(args.tree)


def commit(args):
    print(base.commit(args.message))


def log(args):
    for oid in base.iter_commits_and_parents({args.oid}):
        commit = base.get_commit(oid)

        print(f"commit {oid}\n")
        print(textwrap.indent(commit.message, "    "))
        print("")


def checkout(args):
    base.checkout(args.oid)


def tag(args):
    base.create_tag(args.name, args.oid)


def branch(args):
    base.create_branch(args.name, args.start_point)
    print(f"Branch {args.name} created at {args.start_point[:10]}")


def k(args):
    dot = "digraph commits {\n"

    oids = set()
    # dot only accepts double-quoted strings as quoted node names
    for refname, ref in data.iter_refs(deref=False):
        dot += f'"{refname}" [shape=note]\n'
        dot += f'"{refname}" -> "{ref.value}"\n'
        if not ref.symbolic:
            oids.add(ref.value)

    for oid in base.iter_commits_and_parents(oids):
        commit = base.get_commit(oid)
        dot += f'"{oid}" [shape=box style=filled label="{oid[:10]}"]\n'
        if commit.parent:
            dot += f'"{oid}" -> "{commit.parent}"\n'

    dot += "}"
    print(dot)

    with subprocess.Popen(
        ["dot", "-Tgtk", "/dev/stdin"], stdin=subprocess.PIPE
    ) as proc:
        proc.communicate(dot.encode())
65
ugit/data.py
@@ -1,6 +1,9 @@
|
|||||||
from pathlib import Path
|
from pathlib import Path, PurePath
|
||||||
|
|
||||||
import hashlib
|
import hashlib
|
||||||
|
import os
|
||||||
|
|
||||||
|
from collections import namedtuple
|
||||||
|
|
||||||
GIT_DIR = ".ugit"
|
GIT_DIR = ".ugit"
|
||||||
|
|
||||||
@@ -10,8 +13,62 @@ def init():
|
|||||||
Path.mkdir(f"{GIT_DIR}/objects")
|
Path.mkdir(f"{GIT_DIR}/objects")
|
||||||
|
|
||||||
|
|
||||||
def hash_object(data):
|
RefValue = namedtuple("RefValue", ["symbolic", "value"])
|
||||||
oid = hashlib.sha1(data).hexdigest()
|
|
||||||
|
|
||||||
|
def update_ref(ref, value, deref=True):
    assert not value.symbolic
    ref = _get_ref_internal(ref, deref)[0]
    ref_path = f"{GIT_DIR}/{ref}"
    # Create the parent directory (e.g. refs/tags), not the ref file itself
    Path(ref_path).parent.mkdir(parents=True, exist_ok=True)
    with open(ref_path, "w") as f:
        f.write(value.value)


def get_ref(ref, deref=True):
    return _get_ref_internal(ref, deref)[1]


def _get_ref_internal(ref, deref):
    ref_path = f"{GIT_DIR}/{ref}"
    value = None
    if Path(ref_path).is_file():
        with open(ref_path) as f:
            value = f.read().strip()

    symbolic = bool(value) and value.startswith("ref:")
    if symbolic:
        value = value.split(":", 1)[1].strip()
        if deref:
            return _get_ref_internal(value, deref=True)

    return ref, RefValue(symbolic=symbolic, value=value)


def iter_refs(deref=True):
    refs = ["HEAD"]
    for root, _, filenames in os.walk(f"{GIT_DIR}/refs"):
        root = os.path.relpath(root, GIT_DIR)
        refs.extend(f"{root}/{name}" for name in filenames)

    for refname in refs:
        yield refname, get_ref(refname, deref=deref)
|
||||||
|
|
||||||
|
|
||||||
|
def hash_object(data, type_="blob"):
|
||||||
|
obj = type_.encode() + b"\x00" + data
|
||||||
|
oid = hashlib.sha1(obj).hexdigest()
|
||||||
with open(f"{GIT_DIR}/objects/{oid}", "wb") as out:
|
with open(f"{GIT_DIR}/objects/{oid}", "wb") as out:
|
||||||
out.write(data)
|
out.write(obj)
|
||||||
return oid
|
return oid
|
||||||
|
|
||||||
|
|
||||||
|
def get_object(oid, expected="blob"):
    with open(f"{GIT_DIR}/objects/{oid}", "rb") as f:
        obj = f.read()

    type_, _, content = obj.partition(b"\x00")
    type_ = type_.decode()

    if expected is not None:
        assert type_ == expected, f"Expected {expected}, got {type_}"
    return content
|
||||||
|
|||||||