Compare commits

31 Commits

Author SHA1 Message Date
c484b41a89 Don't always derefence ref 2024-07-06 19:50:52 +02:00
fe5ed910a3 Add change 34 to instructions 2024-07-06 19:49:59 +02:00
9c53919802 Dereference refs when reading and writing 2024-06-29 18:48:21 +02:00
30ce8c84e4 Add change 33 to instructions 2024-06-29 18:47:23 +02:00
6841a97d18 Create RefValue container 2024-06-14 16:29:58 +02:00
556c16c081 Add change 32 to instructions 2024-06-14 16:29:25 +02:00
7a0f86e49b Implement symbolic refs idea 2024-06-09 21:20:53 +02:00
3770c81942 Implement symbolic refs idea 2024-06-09 21:20:38 +02:00
9f8fde3c60 Create new branch 2024-06-05 20:18:40 +02:00
772f631768 Add change 30 to instructions 2024-06-05 20:18:06 +02:00
7fe3e0f497 Use iter_commits_and_parents 2024-05-24 14:45:06 +02:00
7896b80c42 Add change 29 to instructions 2024-05-24 14:44:43 +02:00
b854b4fa18 Render graph 2024-05-22 16:52:44 +02:00
2362d69673 Add change 28 to instructions 2024-05-22 16:52:14 +02:00
d53322c256 Iterate commits and parents 2024-05-16 12:01:30 +02:00
7fbf6640f6 Iterate commits and parents 2024-05-16 11:54:52 +02:00
dad9077515 Add change 27 to instructions 2024-05-16 11:54:28 +02:00
db7d608010 Print refs for k (visualization tool) 2024-05-15 11:01:19 +02:00
c9d8b443ed Add change 26 to instructions 2024-05-15 10:59:54 +02:00
41333f06bc pass HEAD by default to argparse 2024-05-05 21:04:28 +02:00
fe292c02c9 Add change 25 instructions 2024-05-05 21:04:02 +02:00
de595261e6 Try different dirextories when searching for a ref 2024-04-23 17:36:41 +02:00
81bf86d41b Add change 24 instructions 2024-04-23 17:36:02 +02:00
671fa4b6b1 Resolve name to oid in argparse 2024-04-20 21:38:04 +02:00
63dcbeb9e7 Add change 23 instructions 2024-04-20 21:37:31 +02:00
edae32dc86 Create the tag ref 2024-04-17 19:33:54 +02:00
e85766f671 Add change 22 instructions 2024-04-17 19:33:24 +02:00
1f947e6343 Generalize HEAD to refs 2024-04-12 17:19:14 +02:00
cb8e744794 Add change 21 instructions 2024-04-12 17:18:48 +02:00
6797bcfabe Implement CLI command for tagging 2024-04-03 19:47:53 +02:00
95355befb4 Implement CLI command for tagging 2024-04-03 19:47:35 +02:00
18 changed files with 481 additions and 20 deletions

how_to/Change_20.md Normal file

@@ -0,0 +1,41 @@
- tag: Implement CLI command
Now that we have branching history we have some OIDs we need to keep track of.
Assume we have two branches (continuing from the example we had for `checkout`):
```
o-----o-----o-----o-----@-----@-----@
^            \                      ^
first commit  ----$-----$           6c9f80a187ba39b4...
                        ^
                        d8d43b0e3a21df0c...
```
If we want to switch back and forth between the two "branches" with `checkout`,
we need to remember both OIDs, which are quite long.
To make our lives easier, let's implement a command to attach a name to an OID.
Then we'll be able to refer to the OID by that name.
The end result will look like this:
```
$ # Make some changes
...
$ ugit commit
d8d43b0e3a21df0c845e185d08be8e4028787069
$ ugit tag my-cool-commit d8d43b0e3a21df0c845e185d08be8e4028787069
$ # Make more changes
...
$ ugit commit
e549f09bbd08a8a888110b07982952e17e8c9669
$ ugit checkout my-cool-commit
or
$ ugit checkout d8d43b0e3a21df0c845e185d08be8e4028787069
```
The last two commands are equivalent, because "my-cool-commit" is a tag that
points to d8d43b0e3a21df0c845e185d08be8e4028787069.
We will implement this in a few steps. The first step is to create a CLI
command that calls the relevant function in the base module. The base module
does nothing at this stage.

how_to/Change_21.md Normal file

@@ -0,0 +1,23 @@
- tag: Generalize HEAD to refs
As part of implementing `tag`, we'll generalize the way we handle HEAD. If you
think about it, HEAD and tags are similar. They are both ways for ugit to attach
a name to an OID. In case of HEAD, the name is hardcoded by ugit; in case of
tags, the name will be provided by the user. It makes sense to handle them
similarly in *data.py*.
In *data.py*, let's generalize the functions `set_HEAD` and `get_HEAD` to
`update_ref` and `get_ref`. "Ref" is short for reference, and that's the name
Git uses. The functions will now accept the name of the ref and write/read it as
a file under the *.ugit* directory. Logically, a ref is a named pointer to an object.
The important change is in *data.py*. The rest of the changes just rename some
functions:
```
- get_HEAD() -> get_ref('HEAD')
- set_HEAD(oid) -> update_ref('HEAD', oid)
```
Note that we didn't change any behaviour of ugit here; this is purely
refactoring.
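A minimal sketch of what the generalized functions might look like, assuming each ref is stored as a plain file whose content is the OID. The `git_dir` parameter is an assumption added for illustration so the example can run in a temporary directory; ugit itself uses a module-level `GIT_DIR`.

```python
import os
import tempfile


def update_ref(git_dir, ref, oid):
    # Write the OID into a file named after the ref, e.g. <git_dir>/HEAD.
    with open(os.path.join(git_dir, ref), "w") as f:
        f.write(oid)


def get_ref(git_dir, ref):
    # Return the stored OID, or None if the ref does not exist yet.
    path = os.path.join(git_dir, ref)
    if os.path.isfile(path):
        with open(path) as f:
            return f.read().strip()
    return None


git_dir = tempfile.mkdtemp()  # stand-in for the .ugit directory
update_ref(git_dir, "HEAD", "d8d43b0e3a21df0c845e185d08be8e4028787069")
print(get_ref(git_dir, "HEAD"))
```

With this shape, the old `get_HEAD()` is simply `get_ref(..., "HEAD")`.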

how_to/Change_22.md Normal file

@@ -0,0 +1,28 @@
- tag: Create the tag ref
After we've implemented refs in the previous change, it's time to create a ref
when the user creates a tag.
`create_tag` now calls `update_ref` with the tag name to actually create the tag.
For namespacing purposes, we'll put all tags under *refs/tags/*. That is, if the
user creates *my-cool-commit* tag, we'll create *refs/tags/my-cool-commit* ref
to point to the desired OID.
Then we'll update *data.py* to handle this "namespaced" ref. Since we can't have
a / in the file name, we'll create directories for it. Now if a ref
*refs/tags/sometag* is created, it will be placed under *.ugit/refs/tags* in a
file named *sometag*.
To verify that this code works, you can run:
```
$ ugit tag test
```
And make sure that the tag points to HEAD:
```
$ cat .ugit/refs/tags/test
$ cat .ugit/HEAD
```
The last two commands should give the same output.
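The directory handling can be sketched as follows. This is an illustration under assumptions, not the exact ugit code: the `git_dir` parameter is hypothetical and only there so the example can run in a temporary directory.

```python
import os
import tempfile


def update_ref(git_dir, ref, oid):
    ref_path = os.path.join(git_dir, ref)
    # Create the parent directories (e.g. refs/tags) on demand.
    os.makedirs(os.path.dirname(ref_path), exist_ok=True)
    with open(ref_path, "w") as f:
        f.write(oid)


def create_tag(git_dir, name, oid):
    # Tags are namespaced under refs/tags/.
    update_ref(git_dir, f"refs/tags/{name}", oid)


git_dir = tempfile.mkdtemp()  # stand-in for the .ugit directory
create_tag(git_dir, "test", "d8d43b0e3a21df0c845e185d08be8e4028787069")
tag_path = os.path.join(git_dir, "refs", "tags", "test")
with open(tag_path) as f:
    print(f.read())
```

Note that the directory being created is the parent of the ref file, not the ref path itself.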

how_to/Change_23.md Normal file

@@ -0,0 +1,22 @@
- tag: Resolve name to oid in argparse
It's nice that we can create tags, but now let's actually make them usable from
the CLI.
In *base.py*, we'll create `get_oid` to resolve a "name" to an OID. A name can
either be a ref (in which case `get_oid` will return the OID that the ref points
to) or an OID (in which case `get_oid` will just return that same OID).
Next, we'll modify the argument parser in *cli.py* to call `get_oid` on all
arguments which are expected to be an OID. This way we can pass a ref there
instead of an OID.
At this point we can do something like:
```
$ ugit tag mytag d8d43b0e3a21df0c845e185d08be8e4028787069
$ ugit log refs/tags/mytag
# Will print log of commits starting at d8d43b0e...
$ ugit checkout refs/tags/mytag
# Will checkout commit d8d43b0e...
etc...
```

how_to/Change_24.md Normal file

@@ -0,0 +1,18 @@
- base: Try different directories when searching for a ref
In the previous change, you might have noticed that we need to spell out the
full name of a tag (like *refs/tags/mytag*). This isn't very convenient; we
would like to be able to use shorter names. For example, if we've created the
"mytag" tag, we should be able to do `ugit log mytag` rather than having to
specify `ugit log refs/tags/mytag`.
We'll extend `get_oid` to search in different ref subdirectories when resolving
a name. We'll search in:
```
Root (.ugit): This way we can specify refs/tags/mytag
.ugit/refs: This way we can specify tags/mytag
.ugit/refs/tags: This way we can specify mytag
.ugit/refs/heads: This will be needed for a future change
```
If we find the requested name in any of the directories, return it. Otherwise
assume that the name is an OID.
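The lookup order above can be sketched like this. The `get_ref` argument is a hypothetical lookup callable (ref name to OID, or `None`) used here in place of reading files, and raising `ValueError` for unknown names is a choice made for this sketch only.

```python
import string


def get_oid(name, get_ref):
    # Try the name as given, then under each ref subdirectory in turn.
    candidates = [
        name,
        f"refs/{name}",
        f"refs/tags/{name}",
        f"refs/heads/{name}",
    ]
    for ref in candidates:
        oid = get_ref(ref)
        if oid:
            return oid
    # Not a ref: accept the name only if it looks like a full SHA-1.
    if len(name) == 40 and all(c in string.hexdigits for c in name):
        return name
    raise ValueError(f"Unknown name {name}")


refs = {"refs/tags/mytag": "d8d43b0e3a21df0c845e185d08be8e4028787069"}
print(get_oid("mytag", refs.get))
print(get_oid("tags/mytag", refs.get))
print(get_oid("refs/tags/mytag", refs.get))
```

All three spellings resolve to the same OID, because each one matches a different entry in the candidate list.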

how_to/Change_25.md Normal file

@@ -0,0 +1,12 @@
- cli: pass HEAD by default in argparse
First, make "@" be an alias for HEAD. (Implemented in `get_oid`)
Second, do a little refactoring in *cli.py*. Some commands accept an optional
OID argument and if the argument isn't provided it defaults to HEAD. For example
`ugit log` can take an OID to start logging from, but by default it logs HEAD
and all commits before it.
Instead of having each command implement this logic, let's just make "@" (HEAD)
be the default value for those commands. The relevant commands at this stage
are `log` and `tag`. More will follow.
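This works because `argparse` runs string defaults through the `type` callable, just like values given on the command line. A small sketch, where `fake_get_oid` is a hypothetical stand-in for `base.get_oid`:

```python
import argparse


def fake_get_oid(name):
    # Hypothetical resolver: "@" is an alias for HEAD, unknown names pass through.
    refs = {"HEAD": "d8d43b0e3a21df0c845e185d08be8e4028787069"}
    if name == "@":
        name = "HEAD"
    return refs.get(name, name)


parser = argparse.ArgumentParser()
# The string default "@" is converted by `type` when the argument is omitted.
parser.add_argument("oid", default="@", type=fake_get_oid, nargs="?")

print(parser.parse_args([]).oid)
print(parser.parse_args(["abc123"]).oid)
```

The command handlers therefore always receive a resolved OID and need no fallback logic of their own.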

how_to/Change_26.md Normal file

@@ -0,0 +1,14 @@
- k: Print refs
Now that we have refs and a potentially branching commit history, it's a good
idea to create a visualization tool to see all the mess that we've created.
The visualization tool will draw all refs and all the commits pointed to by the refs.
Our command to run the tool will be called `ugit k`, similar to `gitk` (which is
a graphical visualization tool for Git).
We'll create a new `k` command in *cli.py*. We'll also create `iter_refs`, a
generator that iterates over all available refs (it yields HEAD from the
ugit root directory and everything under *.ugit/refs*). As a first step, let's
just print all refs when running `k`.
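One way such a generator could look, as a sketch only: the `git_dir` parameter and the sample refs are assumptions for illustration, and the ref files are read directly instead of going through `get_ref`.

```python
import os
import tempfile


def iter_refs(git_dir):
    # HEAD lives in the root; every other ref lives somewhere under refs/.
    refs = ["HEAD"]
    for root, _, filenames in os.walk(os.path.join(git_dir, "refs")):
        rel_root = os.path.relpath(root, git_dir).replace(os.sep, "/")
        refs.extend(f"{rel_root}/{name}" for name in filenames)
    for refname in refs:
        with open(os.path.join(git_dir, refname)) as f:
            yield refname, f.read().strip()


# Build a tiny fake repository with HEAD and one tag.
git_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(git_dir, "refs", "tags"))
for refname, oid in [("HEAD", "aaa"), ("refs/tags/v1", "bbb")]:
    with open(os.path.join(git_dir, refname), "w") as f:
        f.write(oid)

for refname, oid in iter_refs(git_dir):
    print(refname, oid)
```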

how_to/Change_27.md Normal file

@@ -0,0 +1,21 @@
- k: Iterate commits and parents
In addition to printing the refs, we'll also print all OIDs that are reachable
from those refs. We'll create `iter_commits_and_parents`, which is a generator
that returns all commits that it can reach from a given set of OIDs.
Note that `iter_commits_and_parents` will yield each OID only once, even if it's
reachable from multiple refs. Here, for example:
```
o<----o<----o<----o<----@<----@<----@
^            \                      ^
first commit  -<--$<----$           refs/tags/tag1
                        ^
                        refs/tags/tag2
```
We can reach the first commit by following the parents of *tag1* or by following
the parents of *tag2*. Yet if we call `iter_commits_and_parents({tag1, tag2})`,
the first commit will be yielded only once. This property will be useful later.
(Note that nothing is visualized yet, we're preparing for that.)
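The once-only property can be seen in a small in-memory sketch. Here `parent_of` is a hypothetical dict (commit to parent) that stands in for reading commit objects from the object store.

```python
def iter_commits_and_parents(oids, parent_of):
    pending = set(oids)
    visited = set()
    while pending:
        oid = pending.pop()
        # Skip the missing parent of the first commit and anything seen before.
        if not oid or oid in visited:
            continue
        visited.add(oid)
        yield oid
        pending.add(parent_of.get(oid))


# c1 <- c2 <- c3 (tag1), and c2 <- c4 (tag2)
parent_of = {"c1": None, "c2": "c1", "c3": "c2", "c4": "c2"}
reached = list(iter_commits_and_parents({"c3", "c4"}, parent_of))
print(sorted(reached))
```

Both starting points lead back to `c2` and `c1`, yet each of them appears only once in the result.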

how_to/Change_28.md Normal file

@@ -0,0 +1,18 @@
- k: Render graph
`k` is supposed to be a visualization tool, but so far we've just printed a
bunch of OIDs... Now comes the visualization part!
There's a convenient file format called "dot" that can describe a graph. This is
a textual format. We'll generate a graph of all commits and refs in dot format
and then visualize it using the "dot" utility that comes with Graphviz.
(If you're unfamiliar with dot or Graphviz please look it up online.)
The graph will contain a node for each commit, which points to its parent commit.
The graph will also contain a node for each ref, which points to the relevant
commit.
At this point, `ugit k` is fully functional and I encourage you to play with it.
Create a crazy branching history and a bunch of tags and see for yourself that
`ugit k` can draw all that visually.
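For a feel of what the generated dot text looks like, here is a sketch that builds it from in-memory data. The `refs` and `parent_of` dicts are hypothetical stand-ins for ugit's storage. Note that dot identifiers containing characters such as `/` must be wrapped in double quotes.

```python
def render_dot(refs, parent_of):
    lines = ["digraph commits {"]
    # One note-shaped node per ref, pointing at its commit.
    for refname, oid in refs.items():
        lines.append(f'"{refname}" [shape=note]')
        lines.append(f'"{refname}" -> "{oid}"')
    # One box per commit, pointing at its parent (if any).
    for oid, parent in parent_of.items():
        lines.append(f'"{oid}" [shape=box style=filled label="{oid[:10]}"]')
        if parent:
            lines.append(f'"{oid}" -> "{parent}"')
    lines.append("}")
    return "\n".join(lines)


dot = render_dot({"refs/tags/v1": "c2"}, {"c1": None, "c2": "c1"})
print(dot)
```

Saved to a file, the output can be rendered with Graphviz, for example with `dot -Tpng graph.dot -o graph.png`.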

how_to/Change_29.md Normal file

@@ -0,0 +1,9 @@
- log: Use `iter_commits_and_parents`
Refactoring ahead! Since we have `iter_commits_and_parents` from `k`, let's also
use this function in `log`. We'll need to adjust it a bit to use
`collections.deque` instead of a set so that the order of commits is deterministic.
This generalization might seem unneeded at this point, but it will be useful
later. (Note for the advanced folks: When we implement merge commits that have
multiple parents, this generic way to iterate will come in handy.)

how_to/Change_30.md Normal file

@@ -0,0 +1,82 @@
- branch: Create new branch
Tags were an improvement since they freed us from the burden of remembering OIDs
directly. But they are still somewhat inconvenient, since they are static. Let
me illustrate:
```
o-----o-----o-----o-----o-----o-----o
             \                      ^
              ----o-----o           tag2,HEAD
                        ^
                        tag1
```
If we have the above situation, we can easily flip between *tag1* and *tag2* with
`checkout`. But what happens if we do the following?
- `ugit checkout tag2`
- Make some changes
- `ugit commit`
Now it looks like this:
```
o-----o-----o-----o-----o-----o-----o-----o
             \                      ^     ^
              ----o-----o           tag2  HEAD
                        ^
                        tag1
```
The upper branch has advanced, but *tag2* still points to the previous commit.
This is by design, since tags are supposed to just name a specific OID. So if we
want to remember the new HEAD position we need to create another tag.
But now let's create a ref that will "move forward" as the branch grows. Just
like we have `ugit tag`, we'll create `ugit branch` that will point a branch to
a specific OID. This time the ref will be created under *refs/heads*.
At this stage, `branch` doesn't look any different from tag (the only difference
is that the branch is created under *refs/heads* rather than *refs/tags*). But
the magic will happen once we try to `checkout` a branch.
So far, when we checkout anything, we update HEAD to point to the OID that we've
just checked out. But if we checkout a branch by name, we'll do something
different: we will update HEAD to point to the **name of the branch!** Assume
that we have a branch here:
```
o-----o-----o-----o-----o-----o-----o
             \                      ^
              ----o-----o           tag2,branch2
                        ^
                        tag1
```
Running `ugit checkout branch2` will create the following situation:
```
o-----o-----o-----o-----o-----o-----o
             \                      ^
              ----o-----o           tag2,branch2 <--- HEAD
                        ^
                        tag1
```
You see? HEAD points to *branch2* rather than the OID of the commit directly.
Now if we create another commit, ugit will update HEAD to point to the latest
commit (just like it does every time) but as a side effect it will also update
*branch2* to point to the latest commit.
```
o-----o-----o-----o-----o-----o-----o-----o
             \                      ^     ^
              ----o-----o           tag2  branch2 <--- HEAD
                        ^
                        tag1
```
This way, if we checkout a branch and create some commits on top of it, the ref
will always point to the latest commit.
But right now HEAD (or any ref for that matter) may only point to an OID. It
can't point to another ref, like I described above. So our next step would be
to implement this concept. To mirror Git's terminology, we will call a ref that
points to another ref a "symbolic ref". Please see the next change for an
implementation of symbolic refs.

how_to/Change_31.md Normal file

@@ -0,0 +1,5 @@
- data: Implement symbolic refs idea
If the file that represents a ref contains an OID, we'll assume that the ref
points to an OID. If the file contains the content `ref: <refname>`, we'll
assume that the ref points to `<refname>` and we will dereference it recursively.
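A minimal sketch of that rule, assuming refs are kept in a plain dict (`refs`, hypothetical) instead of files under *.ugit*:

```python
def get_ref(refs, ref):
    value = refs.get(ref)
    if value and value.startswith("ref:"):
        # Symbolic ref: follow the pointer until we reach an OID.
        return get_ref(refs, value.split(":", 1)[1].strip())
    return value


refs = {
    "HEAD": "ref: refs/heads/master",
    "refs/heads/master": "d8d43b0e3a21df0c845e185d08be8e4028787069",
}
print(get_ref(refs, "HEAD"))
```

Reading HEAD here gives the OID stored in *refs/heads/master*, not the text `ref: refs/heads/master`.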

how_to/Change_32.md Normal file

@@ -0,0 +1,8 @@
- data: Create RefValue container
To make working with symbolic refs easier, we will create a `RefValue` container
to represent the value of a ref. `RefValue` will have a `symbolic` property that
says whether it's a symbolic or a direct ref.
This change is just refactoring; we will wrap every OID that is written to or read
from a ref in a `RefValue`.

how_to/Change_33.md Normal file

@@ -0,0 +1,17 @@
- data: Dereference refs when reading and writing
Now we'll dereference symbolic refs not only when reading them but also when
writing them.
We'll implement a helper function called `_get_ref_internal` which will return
the path and the value of the last ref pointed to by a symbolic ref. In simple words:
- When given a non-symbolic ref, `_get_ref_internal` will return the ref name
and value.
- When given a symbolic ref, `_get_ref_internal` will dereference the ref
recursively, and then return the name of the last (non-symbolic) ref that points
to an OID, plus its value.
Now `update_ref` will use `_get_ref_internal` to know which ref it needs to update.
Additionally, we'll use `_get_ref_internal` in `get_ref`.

how_to/Change_34.md Normal file

@@ -0,0 +1,15 @@
- data: Don't always dereference refs (for `ugit k`)
Actually, it's not always desirable to dereference a ref all the way. Sometimes
we would like to know which ref a symbolic ref points to, rather than the final
OID. Or we would like to update a ref directly, rather than updating the last
ref in the chain.
One such use case is `ugit k`. When visualizing refs it would be nice to see
which ref points to which ref. We will see another use case soon.
To accommodate this, we will add a `deref` option to `get_ref`, `iter_refs` and
`update_ref`. If they are called with `deref=False`, they will work on the
raw value of a ref and not dereference any symbolic refs.
Then we will update `k` to use `deref=False`.
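A sketch of how the `deref` option changes what a read returns, again assuming a hypothetical in-memory `refs` dict rather than files:

```python
from collections import namedtuple

RefValue = namedtuple("RefValue", ["symbolic", "value"])


def get_ref(refs, ref, deref=True):
    value = refs.get(ref)
    symbolic = bool(value) and value.startswith("ref:")
    if symbolic:
        value = value.split(":", 1)[1].strip()
        if deref:
            # Follow the chain all the way to the final OID.
            return get_ref(refs, value, deref=True)
    # With deref=False a symbolic ref reports the name it points to.
    return RefValue(symbolic=symbolic, value=value)


refs = {"HEAD": "ref: refs/heads/master", "refs/heads/master": "abc123"}
print(get_ref(refs, "HEAD"))
print(get_ref(refs, "HEAD", deref=False))
```

The second call is what a visualizer wants: it can draw an edge from HEAD to *refs/heads/master* instead of straight to the commit.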

View File

@@ -1,8 +1,9 @@
 import itertools
 import operator
 import os
 import string
-from collections import namedtuple
+from collections import deque, namedtuple
 from pathlib import Path, PurePath
 from . import data
@@ -83,7 +84,7 @@ def read_tree(tree_oid):
 def commit(message):
     commit = f"tree {write_tree()}\n"
-    HEAD = data.get_HEAD()
+    HEAD = data.get_ref("HEAD").value
     if HEAD:
         commit += f"parent {HEAD}\n"
@@ -92,15 +93,23 @@ def commit(message):
     oid = data.hash_object(commit.encode(), "commit")
-    data.set_HEAD(oid)
+    data.update_ref("HEAD", data.RefValue(symbolic=False, value=oid))
     return oid

+def create_tag(name, oid):
+    data.update_ref(f"refs/tags/{name}", data.RefValue(symbolic=False, value=oid))
+
 def checkout(oid):
     commit = get_commit(oid)
     read_tree(commit.tree)
-    data.set_HEAD(oid)
+    data.update_ref("HEAD", data.RefValue(symbolic=False, value=oid))

+def create_branch(name, oid):
+    data.update_ref(f"refs/heads/{name}", data.RefValue(symbolic=False, value=oid))
+
 Commit = namedtuple("Commit", ["tree", "parent", "message"])
@@ -124,5 +133,44 @@ def get_commit(oid):
     return Commit(tree=tree, parent=parent, message=message)

+def iter_commits_and_parents(oids):
+    oids = deque(oids)
+    visited = set()
+    while oids:
+        oid = oids.popleft()
+        if not oid or oid in visited:
+            continue
+        visited.add(oid)
+        yield oid
+        commit = get_commit(oid)
+        # Return parent next
+        oids.appendleft(commit.parent)
+
+def get_oid(name):
+    if name == "@":
+        name = "HEAD"
+    # Name is ref
+    refs_to_try = [
+        f"{name}",
+        f"refs/{name}",
+        f"refs/tags/{name}",
+        f"refs/heads/{name}",
+    ]
+    for ref in refs_to_try:
+        if data.get_ref(ref, deref=False).value:
+            return data.get_ref(ref).value
+    # Name is SHA1
+    is_hex = all(c in string.hexdigits for c in name)
+    if len(name) == 40 and is_hex:
+        return name
+    assert False, f"Unknown name {name}"
+
 def is_ignored(path):
     return ".ugit" in path.split("/")

View File

@@ -1,4 +1,5 @@
 import argparse
+import subprocess
 import sys
 import textwrap
@@ -19,6 +20,8 @@ def parse_args():
     commands = parser.add_subparsers(dest="command")
     commands.required = True
+    oid = base.get_oid
     init_parser = commands.add_parser("init")
     init_parser.set_defaults(func=init)
@@ -28,14 +31,14 @@ def parse_args():
     cat_file_parser = commands.add_parser("cat-file")
     cat_file_parser.set_defaults(func=cat_file)
-    cat_file_parser.add_argument("object")
+    cat_file_parser.add_argument("object", type=oid)
     write_tree_parser = commands.add_parser("write-tree")
     write_tree_parser.set_defaults(func=write_tree)
     read_tree_parser = commands.add_parser("read-tree")
     read_tree_parser.set_defaults(func=read_tree)
-    read_tree_parser.add_argument("tree")
+    read_tree_parser.add_argument("tree", type=oid)
     commit_parser = commands.add_parser("commit")
     commit_parser.set_defaults(func=commit)
@@ -43,11 +46,24 @@ def parse_args():
     log_parser = commands.add_parser("log")
     log_parser.set_defaults(func=log)
-    log_parser.add_argument("oid", nargs="?")
+    log_parser.add_argument("oid", default="@", type=oid, nargs="?")
     checkout_parser = commands.add_parser("checkout")
     checkout_parser.set_defaults(func=checkout)
-    checkout_parser.add_argument("oid")
+    checkout_parser.add_argument("oid", type=oid)
+    tag_parser = commands.add_parser("tag")
+    tag_parser.set_defaults(func=tag)
+    tag_parser.add_argument("name")
+    tag_parser.add_argument("oid", default="@", type=oid, nargs="?")
+    branch_parser = commands.add_parser("branch")
+    branch_parser.set_defaults(func=branch)
+    branch_parser.add_argument("name")
+    branch_parser.add_argument("start_point", default="@", type=oid, nargs="?")
+    k_parser = commands.add_parser("k")
+    k_parser.set_defaults(func=k)
     return parser.parse_args()
@@ -80,16 +96,47 @@ def commit(args):
 def log(args):
-    oid = args.oid or data.get_HEAD()
-    while oid:
+    for oid in base.iter_commits_and_parents({args.oid}):
         commit = base.get_commit(oid)
         print(f"commit {oid}\n")
         print(textwrap.indent(commit.message, " "))
         print("")
-        oid = commit.parent

 def checkout(args):
     base.checkout(args.oid)
+
+def tag(args):
+    base.create_tag(args.name, args.oid)
+
+def branch(args):
+    base.create_branch(args.name, args.start_point)
+    print(f"Branch {args.name} created at {args.start_point[:10]}")
+
+def k(args):
+    dot = "digraph commits {\n"
+    oids = set()
+    # DOT node IDs are quoted with double quotes; single quotes are not valid
+    for refname, ref in data.iter_refs(deref=False):
+        dot += f'"{refname}" [shape=note]\n'
+        dot += f'"{refname}" -> "{ref.value}"\n'
+        if not ref.symbolic:
+            oids.add(ref.value)
+    for oid in base.iter_commits_and_parents(oids):
+        commit = base.get_commit(oid)
+        dot += f'"{oid}" [shape=box style=filled label="{oid[:10]}"]\n'
+        if commit.parent:
+            dot += f'"{oid}" -> "{commit.parent}"\n'
+    dot += "}"
+    print(dot)
+    with subprocess.Popen(
+        ["dot", "-Tgtk", "/dev/stdin"], stdin=subprocess.PIPE
+    ) as proc:
+        proc.communicate(dot.encode())

View File

@@ -1,6 +1,9 @@
-from pathlib import Path
+from pathlib import Path, PurePath
 import hashlib
 import os
+from collections import namedtuple
 GIT_DIR = ".ugit"
@@ -10,15 +13,45 @@ def init():
     Path.mkdir(f"{GIT_DIR}/objects")

-def set_HEAD(oid):
-    with open(f"{GIT_DIR}/HEAD", "w") as f:
-        f.write(oid)
+RefValue = namedtuple("RefValue", ["symbolic", "value"])

-def get_HEAD():
-    if Path.is_file(f"{GIT_DIR}/HEAD"):
-        with open(f"{GIT_DIR}/HEAD") as f:
-            return f.read().strip()
+def update_ref(ref, value, deref=True):
+    assert not value.symbolic
+    ref = _get_ref_internal(ref, deref)[0]
+    ref_path = f"{GIT_DIR}/{ref}"
+    # Create the parent directory (e.g. .ugit/refs/tags), not the ref path itself
+    Path(ref_path).parent.mkdir(parents=True, exist_ok=True)
+    with open(ref_path, "w") as f:
+        f.write(value.value)
+
+def get_ref(ref, deref=True):
+    return _get_ref_internal(ref, deref)[1]
+
+def _get_ref_internal(ref, deref):
+    ref_path = f"{GIT_DIR}/{ref}"
+    value = None
+    if Path(ref_path).is_file():
+        with open(ref_path) as f:
+            value = f.read().strip()
+    symbolic = bool(value) and value.startswith("ref:")
+    if symbolic:
+        value = value.split(":", 1)[1].strip()
+        # Only follow the symbolic ref when the caller asked for it
+        if deref:
+            return _get_ref_internal(value, deref=True)
+    return ref, RefValue(symbolic=symbolic, value=value)
+
+def iter_refs(deref=True):
+    refs = ["HEAD"]
+    # Path.walk() requires Python 3.12 or newer
+    for root, _, filenames in Path(GIT_DIR, "refs").walk():
+        root = root.relative_to(GIT_DIR)
+        refs.extend(f"{root.as_posix()}/{name}" for name in filenames)
+    for refname in refs:
+        yield refname, get_ref(refname, deref=deref)
+
 def hash_object(data, type_="blob"):