Compare commits

...

19 Commits

Author SHA1 Message Date
c484b41a89 Don't always derefence ref 2024-07-06 19:50:52 +02:00
fe5ed910a3 Add change 34 to instructions 2024-07-06 19:49:59 +02:00
9c53919802 Dereference refs when reading and writing 2024-06-29 18:48:21 +02:00
30ce8c84e4 Add change 33 to instructions 2024-06-29 18:47:23 +02:00
6841a97d18 Create RefValue container 2024-06-14 16:29:58 +02:00
556c16c081 Add change 32 to instructions 2024-06-14 16:29:25 +02:00
7a0f86e49b Implement symbolic refs idea 2024-06-09 21:20:53 +02:00
3770c81942 Implement symbolic refs idea 2024-06-09 21:20:38 +02:00
9f8fde3c60 Create new branch 2024-06-05 20:18:40 +02:00
772f631768 Add change 30 to instructions 2024-06-05 20:18:06 +02:00
7fe3e0f497 Use iter_commits_and_parents 2024-05-24 14:45:06 +02:00
7896b80c42 Add change 29 to instructions 2024-05-24 14:44:43 +02:00
b854b4fa18 Render graph 2024-05-22 16:52:44 +02:00
2362d69673 Add change 28 to instructions 2024-05-22 16:52:14 +02:00
d53322c256 Iterate commits and parents 2024-05-16 12:01:30 +02:00
7fbf6640f6 Iterate commits and parents 2024-05-16 11:54:52 +02:00
dad9077515 Add change 27 to instructions 2024-05-16 11:54:28 +02:00
db7d608010 Print refs for k (visualization tool) 2024-05-15 11:01:19 +02:00
c9d8b443ed Add change 26 to instructions 2024-05-15 10:59:54 +02:00
12 changed files with 290 additions and 15 deletions

14
how_to/Change_26.md Normal file

@@ -0,0 +1,14 @@
- k: Print refs
Now that we have refs and a potentially branching commit history, it's a good
idea to create a visualization tool to see all the mess that we've created.
The visualization tool will draw all refs and all the commits that the refs point to.
Our command to run the tool will be called `ugit k`, similar to `gitk` (which is
a graphical visualization tool for Git).
We'll create a new `k` command in *cli.py*. We'll also create `iter_refs`, a
generator that iterates over all available refs (it yields HEAD from the
ugit root directory and everything under *.ugit/refs*). As a first step, let's
just print all refs when running `k`.
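
As a rough sketch of the idea (not the actual ugit code), the ref names could be collected like this. The helper name `iter_ref_names`, the `repo_root` parameter, and the throwaway directory built in `demo` are assumptions made for illustration:

```python
import os
import tempfile

GIT_DIR = ".ugit"  # same directory name the tutorial uses


def iter_ref_names(repo_root):
    """Yield HEAD first, then every file found under <repo_root>/.ugit/refs."""
    git_dir = os.path.join(repo_root, GIT_DIR)
    yield "HEAD"
    for root, _, filenames in os.walk(os.path.join(git_dir, "refs")):
        # Make names relative to .ugit, e.g. "refs/tags/v1"
        rel_root = os.path.relpath(root, git_dir).replace(os.sep, "/")
        for name in sorted(filenames):
            yield f"{rel_root}/{name}"


def demo():
    """Build a throwaway repo layout with two tags and list its ref names."""
    with tempfile.TemporaryDirectory() as repo:
        tags = os.path.join(repo, GIT_DIR, "refs", "tags")
        os.makedirs(tags)
        for tag in ("v1", "v2"):
            with open(os.path.join(tags, tag), "w") as f:
                f.write("0" * 40)  # placeholder OID
        return list(iter_ref_names(repo))


ref_names = demo()
```

A `k` command at this stage would simply loop over these names and print each one together with its value.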

21
how_to/Change_27.md Normal file

@@ -0,0 +1,21 @@
- k: Iterate commits and parents
In addition to printing the refs, we'll also print all OIDs that are reachable
from those refs. We'll create `iter_commits_and_parents`, a generator
that yields every commit reachable from a given set of OIDs.
Note that `iter_commits_and_parents` will yield each OID only once, even if it's
reachable from multiple refs. Here, for example:
```
o<----o<----o<----o<----@<----@<----@
^            \                      ^
first commit  -<--$<----$           refs/tags/tag1
                        ^
                        refs/tags/tag2
```
We can reach the first commit by following the parents of *tag1* or by following
the parents of *tag2*. Yet if we call `iter_commits_and_parents({tag1, tag2})`,
the first commit will be yielded only once. This property will be useful later.
(Note that nothing is visualized yet; we're only preparing for that.)
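
To make the "yielded only once" property concrete, here is a minimal sketch that walks an in-memory history instead of real commit objects. The `PARENTS` mapping and its commit names are made up for the example; the set-based traversal mirrors the description above and is only one possible way to write it:

```python
# Hypothetical history: child -> parent (None marks the first commit)
PARENTS = {
    "first": None,
    "a": "first",
    "b": "a",
    "tag1_tip": "b",
    "tag2_tip": "a",
}


def iter_commits_and_parents(oids, parents=PARENTS):
    """Yield every commit reachable from `oids`, each one exactly once."""
    pending = set(oids)
    visited = set()
    while pending:
        oid = pending.pop()
        if not oid or oid in visited:
            continue  # skip the missing parent of the first commit, and repeats
        visited.add(oid)
        yield oid
        pending.add(parents[oid])


reachable = list(iter_commits_and_parents({"tag1_tip", "tag2_tip"}))
```

Both tips lead back to `first`, yet it shows up in `reachable` a single time because of the `visited` set.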

18
how_to/Change_28.md Normal file

@@ -0,0 +1,18 @@
- k: Render graph
`k` is supposed to be a visualization tool, but so far we've just printed a
bunch of OIDs... Now comes the visualization part!
There's a convenient textual file format called "dot" that can describe a graph.
We'll generate a graph of all commits and refs in dot format and then visualize
it using the `dot` utility that comes with Graphviz.
(If you're unfamiliar with dot or Graphviz, please look them up online.)
The graph will contain a node for each commit, with an edge pointing to its
parent commit. The graph will also contain a node for each ref, with an edge
pointing to the commit that the ref names.
At this point, `ugit k` is fully functional and I encourage you to play with it.
Create a crazy branching history and a bunch of tags and see for yourself that
`ugit k` can draw all that visually.
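
For readers who haven't seen dot before, the following sketch shows roughly what the generated text could look like. The function name `render_dot` and the sample refs and OIDs are invented for the example. Node IDs are wrapped in double quotes, which is how dot quotes IDs that contain characters such as `/`:

```python
def render_dot(refs, parents):
    """Return dot text for refs (name -> oid) and commits (oid -> parent)."""
    lines = ["digraph commits {"]
    for refname, oid in refs.items():
        lines.append(f'"{refname}" [shape=note]')
        lines.append(f'"{refname}" -> "{oid}"')
    for oid, parent in parents.items():
        lines.append(f'"{oid}" [shape=box style=filled label="{oid[:10]}"]')
        if parent:  # the first commit has no parent edge
            lines.append(f'"{oid}" -> "{parent}"')
    lines.append("}")
    return "\n".join(lines)


dot_text = render_dot(
    {"refs/tags/v1": "abc123"},
    {"abc123": "def456", "def456": None},
)
```

Saving the text to a file such as `commits.dot` and running `dot -Tpng commits.dot -o commits.png` should produce an image, assuming Graphviz is installed.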

9
how_to/Change_29.md Normal file

@@ -0,0 +1,9 @@
- log: Use `iter_commits_and_parents`
Refactoring ahead! Since we have `iter_commits_and_parents` from `k`, let's also
use this function in `log`. We'll need to adjust it a bit to use
`collections.deque` instead of a set so that the order of commits is deterministic.
This generalization might seem unneeded at this point, but it will be useful
later. (Note for the advanced folks: When we implement merge commits that have
multiple parents, this generic way to iterate will come in handy.)
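
Here is a small sketch of why the deque helps `log`. It again uses a made-up in-memory `PARENTS` mapping rather than real commits. Because the parent is pushed to the front of the queue, commits come out newest to oldest in a predictable order:

```python
from collections import deque

# Hypothetical linear history: child -> parent
PARENTS = {"c3": "c2", "c2": "c1", "c1": None}


def iter_commits_and_parents(oids, parents=PARENTS):
    """Yield reachable commits once each, in a deterministic order."""
    pending = deque(oids)
    visited = set()
    while pending:
        oid = pending.popleft()
        if not oid or oid in visited:
            continue
        visited.add(oid)
        yield oid
        # Visit the parent next so history reads newest to oldest
        pending.appendleft(parents[oid])


def log(start_oid):
    """Return the lines a simplified `log` would print."""
    return [f"commit {oid}" for oid in iter_commits_and_parents([start_oid])]


log_lines = log("c3")
```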

82
how_to/Change_30.md Normal file

@@ -0,0 +1,82 @@
- branch: Create new branch
Tags were an improvement since they freed us from the burden of remembering OIDs
directly. But they are still somewhat inconvenient, since they are static. Let
me illustrate:
```
o-----o-----o-----o-----o-----o-----o
             \                      ^
              ----o-----o           tag2,HEAD
                        ^
                        tag1
```
If we have the above situation, we can easily flip between *tag1* and *tag2* with
`checkout`. But what happens if we do the following?

- `ugit checkout tag2`
- Make some changes
- `ugit commit`

Now it looks like this:
```
o-----o-----o-----o-----o-----o-----o-----o
             \                      ^     ^
              ----o-----o           tag2  HEAD
                        ^
                        tag1
```
The upper branch has advanced, but *tag2* still points to the previous commit.
This is by design, since tags are supposed to just name a specific OID. So if we
want to remember the new HEAD position we need to create another tag.
But now let's create a ref that will "move forward" as the branch grows. Just
like we have `ugit tag`, we'll create `ugit branch` that will point a branch to
a specific OID. This time the ref will be created under *refs/heads*.
At this stage, `branch` doesn't look any different from `tag` (the only difference
is that the branch is created under *refs/heads* rather than *refs/tags*). But
the magic will happen once we try to `checkout` a branch.
So far, when we check out anything, we update HEAD to point to the OID that we've
just checked out. But if we check out a branch by name, we'll do something
different: we will update HEAD to point to the **name of the branch!** Assume
that we have a branch here:
```
o-----o-----o-----o-----o-----o-----o
             \                      ^
              ----o-----o           tag2,branch2
                        ^
                        tag1
```
Running `ugit checkout branch2` will create the following situation:
```
o-----o-----o-----o-----o-----o-----o
             \                      ^
              ----o-----o           tag2,branch2 <--- HEAD
                        ^
                        tag1
```
You see? HEAD points to *branch2* rather than the OID of the commit directly.
Now if we create another commit, ugit will update HEAD to point to the latest
commit (just like it does every time) but as a side effect it will also update
*branch2* to point to the latest commit.
```
o-----o-----o-----o-----o-----o-----o-----o
             \                      ^     ^
              ----o-----o           tag2  branch2 <--- HEAD
                        ^
                        tag1
```
This way, if we check out a branch and create some commits on top of it, the ref
will always point to the latest commit.
But right now HEAD (or any ref for that matter) may only point to an OID. It
can't point to another ref, as described above. So our next step will be
to implement this concept. To mirror Git's terminology, we will call a ref that
points to another ref a "symbolic ref". Please see the next change for an
implementation of symbolic refs.
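
The intended behaviour can be sketched with a plain dictionary standing in for the ref files. Everything here (the `refs` dict, the short commit names, the helpers `resolve_name` and `commit`) is a simplified stand-in for illustration, not ugit's real code:

```python
# Refs as a dict; HEAD names the branch instead of holding an OID
refs = {
    "refs/heads/branch2": "c7",
    "refs/tags/tag2": "c7",
    "HEAD": "ref: refs/heads/branch2",
}


def resolve_name(ref):
    """Follow 'ref: <name>' links and return the last ref that holds an OID."""
    while refs.get(ref, "").startswith("ref: "):
        ref = refs[ref][len("ref: "):]
    return ref


def commit(new_oid):
    """Record a new commit: the ref that HEAD ultimately names moves forward."""
    refs[resolve_name("HEAD")] = new_oid


commit("c8")
```

After the commit, *branch2* has advanced to the new commit, HEAD still names *branch2*, and *tag2* stays where it was.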

5
how_to/Change_31.md Normal file

@@ -0,0 +1,5 @@
- data: Implement symbolic refs idea
If the file that represents a ref contains an OID, we'll assume that the ref
points to an OID. If the file contains the content `ref: <refname>`, we'll
assume that the ref points to `<refname>` and we will dereference it recursively.
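
A minimal sketch of that rule, assuming the ref files are represented by a dict from ref name to file content (the `files` parameter and the sample OID are made up for the example):

```python
def get_ref(ref, files):
    """Return the OID that `ref` ultimately points to, or None if it is unset."""
    content = files.get(ref)
    if content is not None and content.startswith("ref:"):
        # Symbolic ref: take the name after "ref:" and resolve it in turn
        target = content.split(":", 1)[1].strip()
        return get_ref(target, files)
    return content


FILES = {"HEAD": "ref: refs/heads/master", "refs/heads/master": "1f3a9c"}

head_oid = get_ref("HEAD", FILES)
missing = get_ref("refs/heads/missing", FILES)
```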

8
how_to/Change_32.md Normal file

@@ -0,0 +1,8 @@
- data: Create RefValue container
To make working with symbolic refs easier, we will create a `RefValue` container
to represent the value of a ref. `RefValue` will have a `symbolic` property that
says whether it's a symbolic or a direct ref.
This change is just refactoring: we will wrap every OID that is written to or read
from a ref in a `RefValue`.
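
The container itself can be as small as a `namedtuple`. In the sketch below, `parse_ref_content` is a hypothetical helper added only to show how raw file content might map onto the two kinds of value:

```python
from collections import namedtuple

RefValue = namedtuple("RefValue", ["symbolic", "value"])


def parse_ref_content(content):
    """Wrap raw ref-file content in a RefValue (hypothetical helper)."""
    if content.startswith("ref:"):
        return RefValue(symbolic=True, value=content.split(":", 1)[1].strip())
    return RefValue(symbolic=False, value=content.strip())


direct = parse_ref_content("1f3a9c\n")
symbolic = parse_ref_content("ref: refs/heads/master\n")
```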

17
how_to/Change_33.md Normal file

@@ -0,0 +1,17 @@
- data: Dereference refs when reading and writing
Now we'll dereference symbolic refs not only when reading them but also when
writing them.
We'll implement a helper function called `_get_ref_internal` which will return
the name and the value of the last ref pointed to by a symbolic ref. In simple words:

- When given a non-symbolic ref, `_get_ref_internal` will return the ref name
and value.
- When given a symbolic ref, `_get_ref_internal` will dereference the ref
recursively, and then return the name of the last (non-symbolic) ref that points
to an OID, plus its value.

Now `update_ref` will use `_get_ref_internal` to know which ref it needs to update.
Additionally, we'll use `_get_ref_internal` in `get_ref`.
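
The write-through behaviour might look like the sketch below. To keep it self-contained, a dict named `STORE` stands in for the files under *.ugit*. That dict and the sample OIDs are assumptions, while the function names follow the description above:

```python
from collections import namedtuple

RefValue = namedtuple("RefValue", ["symbolic", "value"])

STORE = {  # stands in for the ref files under .ugit
    "HEAD": "ref: refs/heads/master",
    "refs/heads/master": "aaa111",
}


def _get_ref_internal(ref):
    """Return (name, RefValue) of the last non-symbolic ref in the chain."""
    value = STORE.get(ref)
    if value and value.startswith("ref:"):
        return _get_ref_internal(value.split(":", 1)[1].strip())
    return ref, RefValue(symbolic=False, value=value)


def get_ref(ref):
    return _get_ref_internal(ref)[1]


def update_ref(ref, value):
    assert not value.symbolic
    target, _ = _get_ref_internal(ref)  # the ref that really holds the OID
    STORE[target] = value.value


update_ref("HEAD", RefValue(symbolic=False, value="bbb222"))
```

Writing through HEAD changes *refs/heads/master*, while HEAD itself keeps naming the branch.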

15
how_to/Change_34.md Normal file

@@ -0,0 +1,15 @@
- data: Don't always dereference refs (for `ugit k`)
Actually, it's not always desirable to dereference a ref all the way. Sometimes
we would like to know which ref a symbolic ref points to, rather than the final
OID. Or we would like to update a ref directly, rather than updating the last
ref in the chain.
One such use case is `ugit k`. When visualizing refs it would be nice to see
which ref points to which ref. We will see another use case soon.
To accommodate this, we will add a `deref` option to `get_ref`, `iter_refs` and
`update_ref`. When called with `deref=False`, they will work on the
raw value of a ref and not dereference any symbolic refs.
Then we will update `k` to use `deref=False`.
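
One way the `deref` option could be threaded through is sketched below, this time against real files in a temporary directory. This is an illustration under a few assumptions, not the repository's code: the extra `git_dir` parameter and the `demo` function exist only to make the example self-contained, `iter_refs` is left out for brevity, and the pathlib calls are made on `Path` instances (`is_file()`, `read_text()`, `parent.mkdir(...)`).

```python
import tempfile
from collections import namedtuple
from pathlib import Path

RefValue = namedtuple("RefValue", ["symbolic", "value"])


def _get_ref_internal(git_dir, ref, deref):
    """Return (name, RefValue); follow symbolic refs only when deref is true."""
    ref_path = Path(git_dir) / ref
    value = ref_path.read_text().strip() if ref_path.is_file() else None
    symbolic = bool(value) and value.startswith("ref:")
    if symbolic:
        value = value.split(":", 1)[1].strip()
        if deref:
            return _get_ref_internal(git_dir, value, deref=True)
    return ref, RefValue(symbolic=symbolic, value=value)


def get_ref(git_dir, ref, deref=True):
    return _get_ref_internal(git_dir, ref, deref)[1]


def update_ref(git_dir, ref, value, deref=True):
    assert not value.symbolic
    ref = _get_ref_internal(git_dir, ref, deref)[0]
    ref_path = Path(git_dir) / ref
    ref_path.parent.mkdir(parents=True, exist_ok=True)  # create refs/heads etc.
    ref_path.write_text(value.value)


def demo():
    with tempfile.TemporaryDirectory() as git_dir:
        master = "refs/heads/master"
        update_ref(git_dir, master, RefValue(False, "aaa111"))
        (Path(git_dir) / "HEAD").write_text(f"ref: {master}")
        deep = get_ref(git_dir, "HEAD")              # follows HEAD to master
        raw = get_ref(git_dir, "HEAD", deref=False)  # HEAD's own content
        update_ref(git_dir, "HEAD", RefValue(False, "bbb222"))  # writes master
        moved = get_ref(git_dir, master)
        return deep, raw, moved


deep, raw, moved = demo()
```

With `deref=False`, `k` can draw an edge from HEAD to the branch it names instead of straight to the commit.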

View File

@@ -3,7 +3,7 @@ import operator
 import os
 import string

-from collections import namedtuple
+from collections import deque, namedtuple
 from pathlib import Path, PurePath

 from . import data
@@ -84,7 +84,7 @@ def read_tree(tree_oid):
 def commit(message):
     commit = f"tree {write_tree()}\n"

-    HEAD = data.get_ref("HEAD")
+    HEAD = data.get_ref("HEAD").value
     if HEAD:
         commit += f"parent {HEAD}\n"

@@ -93,19 +93,23 @@ def commit(message):

     oid = data.hash_object(commit.encode(), "commit")

-    data.update_ref("HEAD", oid)
+    data.update_ref("HEAD", data.RefValue(symbolic=False, value=oid))

     return oid


 def create_tag(name, oid):
-    data.update_ref(f"refs/tags/{name}", oid)
+    data.update_ref(f"refs/tags/{name}", data.RefValue(symbolic=False, value=oid))


 def checkout(oid):
     commit = get_commit(oid)
     read_tree(commit.tree)
-    data.update_ref("HEAD", oid)
+    data.update_ref("HEAD", data.RefValue(symbolic=False, value=oid))
+
+
+def create_branch(name, oid):
+    data.update_ref(f"refs/heads/{name}", data.RefValue(symbolic=False, value=oid))


 Commit = namedtuple("Commit", ["tree", "parent", "message"])
@@ -129,6 +133,22 @@ def get_commit(oid):
     return Commit(tree=tree, parent=parent, message=message)


+def iter_commits_and_parents(oids):
+    oids = deque(oids)
+    visited = set()
+
+    while oids:
+        oid = oids.popleft()
+        if not oid or oid in visited:
+            continue
+        visited.add(oid)
+        yield oid
+
+        commit = get_commit(oid)
+        # Return parent next
+        oids.appendleft(commit.parent)
+
+
 def get_oid(name):
     if name == "@":
         name = "HEAD"
@@ -141,8 +161,8 @@ def get_oid(name):
         f"refs/heads/{name}",
     ]
     for ref in refs_to_try:
-        if data.get_ref(ref):
-            return data.get_ref(ref)
+        if data.get_ref(ref, deref=False).value:
+            return data.get_ref(ref).value

     # Name is SHA1
     is_hex = all(c in string.hexdigits for c in name)

View File

@@ -1,4 +1,5 @@
 import argparse
+import subprocess
 import sys
 import textwrap

@@ -56,6 +57,14 @@ def parse_args():
     tag_parser.add_argument("name")
     tag_parser.add_argument("oid", default="@", type=oid, nargs="?")

+    branch_parser = commands.add_parser("branch")
+    branch_parser.set_defaults(func=branch)
+    branch_parser.add_argument("name")
+    branch_parser.add_argument("start_point", default="@", type=oid, nargs="?")
+
+    k_parser = commands.add_parser("k")
+    k_parser.set_defaults(func=k)
+
     return parser.parse_args()


@@ -87,16 +96,13 @@ def commit(args):


 def log(args):
-    oid = args.oid
-    while oid:
+    for oid in base.iter_commits_and_parents({args.oid}):
         commit = base.get_commit(oid)

         print(f"commit {oid}\n")
         print(textwrap.indent(commit.message, " "))
         print("")

-        oid = commit.parent
-

 def checkout(args):
     base.checkout(args.oid)
@@ -104,3 +110,33 @@ def checkout(args):

 def tag(args):
     base.create_tag(args.name, args.oid)
+
+
+def branch(args):
+    base.create_branch(args.name, args.start_point)
+    print(f"Branch {args.name} created at {args.start_point[:10]}")
+
+
+def k(args):
+    dot = "digraph commits {\n"
+
+    oids = set()
+    for refname, ref in data.iter_refs(deref=False):
+        dot += f"'{refname}' [shape=note]\n"
+        dot += f"'{refname}' -> '{ref.value}'\n"
+        if not ref.symbolic:
+            oids.add(ref.value)
+
+    for oid in base.iter_commits_and_parents(oids):
+        commit = base.get_commit(oid)
+        dot += f"'{oid}' [shape=box style=filled label='{oid[:10]}']\n"
+        if commit.parent:
+            dot += f"'{oid}' -> '{commit.parent}'\n"
+
+    dot += "}"
+    print(dot)
+
+    with subprocess.Popen(
+        ["dot", "-Tgtk", "/dev/stdin"], stdin=subprocess.PIPE
+    ) as proc:
+        proc.communicate(dot.encode())

View File

@@ -1,6 +1,9 @@
-from pathlib import Path
+from pathlib import Path, PurePath
 import hashlib
 import os

+from collections import namedtuple
+
+
 GIT_DIR = ".ugit"

@@ -10,18 +13,45 @@ def init():
     Path.mkdir(f"{GIT_DIR}/objects")


-def update_ref(ref, oid):
+RefValue = namedtuple("RefValue", ["symbolic", "value"])
+
+
+def update_ref(ref, value, deref=True):
+    assert not value.symbolic
+    ref = _get_ref_internal(ref, deref)[0]
     ref_path = f"{GIT_DIR}/{ref}"
     Path.mkdir(ref_path, exist_ok=True)
     with open(ref_path, "w") as f:
-        f.write(oid)
+        f.write(value.value)


 def get_ref(ref):
+    return _get_ref_internal(ref)[1]
+
+
+def _get_ref_internal(ref):
     ref_path = f"{GIT_DIR}/{ref}"
+    value = None
     if Path.is_file(ref_path):
         with open(ref_path) as f:
-            return f.read().strip()
+            value = f.read().strip()
+
+    symbolic = bool(value) and value.startswith("ref")
+    if symbolic:
+        value = value.split(":", 1)[1].strip()
+        return _get_ref_internal(value)
+
+    return ref, RefValue(symbolic=False, value=value)
+
+
+def iter_refs():
+    refs = ["HEAD"]
+    for root, _, filenames in Path.walk(f"{GIT_DIR}/refs"):
+        root = PurePath.relative_to(root, GIT_DIR)
+        refs.extend(f"{root}/{name}" for name in filenames)
+
+    for refname in refs:
+        yield refname, get_ref(refname)


 def hash_object(data, type_="blob"):