Git Basics

How does Git store information?

  • Git is a distributed version control system.

  • At its core, git is like a key-value store.

  • The value is the data, and the key is the hash of that data.

  • The key is used to retrieve the content.

The Key

  • The key is a SHA-1 hash. SHA-1 is a cryptographic hash function.

  • Given a piece of data, it produces a 40-digit hexadecimal number.

  • The same input always produces the same hash.

The Value

Git stores the compressed data in a blob, with metadata in a header placed in front of the content. The stored object consists of the following, in order:

  • the identifier blob, followed by a space

  • the size of the content

  • \0 delimiter

  • content
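The layout above can be checked by hand. This is a minimal sketch, assuming a POSIX shell with sha1sum available. The sample content 'hello world' is only an illustration and does not come from the text above.

```shell
# Sample content (an assumption for illustration). echo would append a newline,
# so the newline is counted as part of the content.
content='hello world'
size=$(printf '%s\n' "$content" | wc -c | tr -d ' ')   # 12 bytes

# Hash "blob <size>\0<content>" with SHA-1, mirroring the layout above.
manual_hash=$(printf 'blob %s\000%s\n' "$size" "$content" | sha1sum | cut -d' ' -f1)
echo "$manual_hash"
# → 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
```

The hash is computed over the uncompressed header and content. Compression only affects how the object is written to disk.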

Git hash-object

Under the hood, git provides the git hash-object command, which generates the SHA-1 for a given value. For example:

echo 'Nitin Raturi' | git hash-object --stdin
#44b57a455c8b6a98f89496092f33268b9def5750

You can try this yourself and you will get a SHA-1 string.

Data storage

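Git stores its data inside the .git directory, which is created when we run the git init command. Inside the .git directory, blobs and other objects are stored in the .git/objects directory. Each object is saved in a file named after its SHA-1: the first two hexadecimal digits form the directory name and the remaining 38 form the filename.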
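To see where an object ends up on disk, the sketch below writes one blob into a throwaway repository and reads it back. It assumes git is installed. The temporary directory and the sample content are arbitrary choices for illustration.

```shell
# Work in a throwaway repository (the temporary path is arbitrary).
repo=$(mktemp -d)
cd "$repo" || exit 1
git init --quiet

# -w writes the blob into the object store in addition to printing its hash.
hash=$(echo 'hello world' | git hash-object -w --stdin)

# Loose objects live at .git/objects/<first 2 hex digits>/<remaining 38>.
dir=$(echo "$hash" | cut -c1-2)
file=$(echo "$hash" | cut -c3-)
ls ".git/objects/$dir/$file"

# cat-file reads the object back: -t prints its type, -p its content.
type=$(git cat-file -t "$hash")
body=$(git cat-file -p "$hash")
echo "$type: $body"
```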

Git records directory structure in tree objects. A tree contains pointers (SHA-1s) to blobs and to other trees, along with metadata for each entry: the type of pointer (blob or tree), the filename or directory name, and the mode.
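A tree can be inspected directly. The following sketch assumes git is installed. The file names, the commit message, and the user identity are placeholders chosen for illustration.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init --quiet
# Placeholder identity so that git commit works in a fresh environment.
git config user.name "Example User"
git config user.email "user@example.com"

# One file at the top level and one inside a directory.
echo 'hello world' > hello.txt
mkdir docs
echo 'notes' > docs/notes.txt
git add .
git commit --quiet -m "Add sample files"

# Print the root tree of the latest commit: one line per entry with
# mode, type (blob or tree), hash, and name.
tree_listing=$(git cat-file -p 'HEAD^{tree}')
echo "$tree_listing"
```

The docs directory shows up as a tree entry, and hello.txt as a blob entry.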

Commit Object

A commit points to a tree and contains metadata such as author and committer, date, message, and parent commits (zero or more: the first commit has none, a merge commit has several). The SHA-1 of the commit is the hash of all this information.
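The fields of a commit are visible in its raw form. This is a small sketch, assuming git is installed. The identity, file name, and messages are placeholders.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init --quiet
# Placeholder identity so that git commit works in a fresh environment.
git config user.name "Example User"
git config user.email "user@example.com"

echo 'first' > file.txt
git add file.txt
git commit --quiet -m "First commit"
echo 'second' >> file.txt
git commit --quiet -am "Second commit"

# Type of the object HEAD resolves to.
head_type=$(git cat-file -t HEAD)

# Raw commit: tree, parent, author, committer, a blank line, then the message.
commit_body=$(git cat-file -p HEAD)
echo "$commit_body"

# The first commit in a repository has no parent line.
root_body=$(git cat-file -p 'HEAD~1')
```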

Useful commands for this section

git cat-file -t <commit_id>
git cat-file -p <commit_id>
git log --oneline
git --no-pager log --oneline

Git Areas

There are 4 areas in git where your code lives: a repository, a working area, a staging area, and a stash.

  • Working Area: The files on disk that you are editing. Files here that are not handled by git and are not in the staging area are called untracked files.

  • Repository: It contains all the files and the commits git knows about.

  • Staging Area: The staging area is how git knows what will change between the current commit and the next commit.

  • Stash Area: This area is useful for saving uncommitted work, for example when you need to check out another branch or reset in the middle of your work. Useful commands:

#Moving files in and out of the staging area
git add <file>
git rm <file>
git mv <source> <destination>
git add -p

#Stash basic use
git stash
git stash list
git stash show
git stash show stash@{0}
git stash apply
git stash apply stash@{0}
git stash --include-untracked
git stash --all
git stash push -m "stash name" #git stash save is deprecated
git stash branch <branch name> [<stash name>]
git checkout <stash name> -- <filename>
git stash pop
git stash drop
git stash drop stash@{n}
git stash clear
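The basic stash round trip can be sketched as follows, assuming git is installed. The repository, the file name, and the identity are placeholders for illustration.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init --quiet
# Placeholder identity so that git commit and git stash work in a fresh environment.
git config user.name "Example User"
git config user.email "user@example.com"

echo 'committed' > file.txt
git add file.txt
git commit --quiet -m "Initial commit"

# Uncommitted edit to a tracked file.
echo 'work in progress' >> file.txt

# Stash it: the file goes back to its last committed state.
git stash > /dev/null
after_stash=$(cat file.txt)
stash_count=$(git stash list | wc -l | tr -d ' ')

# Bring the change back and remove it from the stash.
git stash pop > /dev/null
after_pop=$(cat file.txt)
remaining=$(git stash list | wc -l | tr -d ' ')
echo "$after_pop"
```

git stash apply would restore the change in the same way but leave the entry in the stash list.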

References

References in git, such as tags, branches, and HEAD, are pointers to commits.

  • Branch: Pointer to a particular commit.

  • HEAD: Tells git which branch is the current one. It points at the name of the current branch.

  • Tag: Pointer to a commit.

git tag <tagname>
git tag -a <v1.0> -m "message"
git show <v1.0>
git tag
git show-ref --tags
git tag --points-at <commit>
git show <tagname>
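The three kinds of reference can be compared in a scratch repository. This is a sketch, assuming git is installed. The tag names light and v1.0 and the identity are placeholders.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init --quiet
# Placeholder identity so that commits and annotated tags can be created.
git config user.name "Example User"
git config user.email "user@example.com"
echo 'hello' > file.txt
git add file.txt
git commit --quiet -m "Initial commit"

# HEAD holds the name of the current branch, not a commit hash.
head_ref=$(git symbolic-ref HEAD)
echo "$head_ref"

# The branch itself resolves to a commit hash.
branch=$(git symbolic-ref --short HEAD)
branch_commit=$(git rev-parse "$branch")

# A lightweight tag points straight at the commit.
# An annotated tag (-a) is an object of its own that points at the commit.
git tag light
git tag -a v1.0 -m "First release"
light_type=$(git cat-file -t light)
annotated_type=$(git cat-file -t v1.0)
tagged_commit=$(git rev-parse 'v1.0^{commit}')
echo "$light_type $annotated_type"
```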

Merge

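Merge commits are commits that have more than one parent. We generally merge with the git merge <branchname> command, but when a fast-forward is possible git only moves the branch pointer and creates no merge commit, so the history does not show that a merge happened.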

To retain the history we can use the git merge <branchname> --no-ff command.
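The effect of --no-ff can be checked by counting the parents of the resulting commit. This is a sketch, assuming git is installed. The branch name feature, the file names, and the identity are hypothetical.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init --quiet
# Placeholder identity so that commits can be created.
git config user.name "Example User"
git config user.email "user@example.com"
echo 'base' > file.txt
git add file.txt
git commit --quiet -m "Initial commit"
main_branch=$(git symbolic-ref --short HEAD)

# Commit on a hypothetical branch named "feature".
git checkout --quiet -b feature
echo 'feature work' > feature.txt
git add feature.txt
git commit --quiet -m "Add feature"

# Back on the original branch a fast-forward would be possible,
# so --no-ff is what forces a merge commit here.
git checkout --quiet "$main_branch"
git merge --quiet --no-ff -m "Merge feature" feature

# A merge commit lists more than one parent.
parent_count=$(git cat-file -p HEAD | grep -c '^parent ')
echo "$parent_count"
```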

Many times while merging we encounter merge conflicts. If we have to solve the same merge conflict multiple times, we can make use of git rerere.

Git rerere (reuse recorded resolution) saves how you resolved a conflict and reapplies that resolution when the same conflict appears again. To use this you have to enable it.

git config rerere.enabled true #use --global for all projects

Not complete yet.