Introduction to Git and GitHub for beginners

This is the first addition to the “In 30 minutes” series, which aims to make you familiar with a new technology as quickly as possible, without making you go through a lot of resources.

Basic Terms


  • Repository : A repository is the root folder of your project. It stores all your project files, along with every version of those files that Git saves.
  • Star : Starring a repository shows appreciation to the maintainer for their work and bookmarks the repository for easier access.
  • Watch : Watching a repository sends you notifications for any new pull requests and issues that are created, including those that do not mention you.
  • Fork : A fork creates a copy of the repository that is managed by you, independent of the parent project.
  • Commit : A commit is a saved state, or snapshot, of your repository. Each commit is identified by a 40-character hash, which distinguishes one project snapshot from another.

Configuration


Git keeps track of files in three states:

1. Unmodified : Files that are in a Git repository and haven't been changed since the last commit.
2. Modified : Files in a Git repository that have been changed but not yet staged.
3. Staged : Files that will be included in the next commit.
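
The three states can be observed with git status. The following is a minimal sketch that works in a throwaway folder; the file names and the "Jon Doe" identity are placeholders chosen for illustration.

```shell
# Work in a temporary repository so nothing else on the machine is touched.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Jon Doe"            # placeholder identity
git config user.email "jon@example.com"   # placeholder identity

# Commit three files so that all of them start out unmodified.
echo "one" > a.txt
echo "two" > b.txt
echo "three" > c.txt
git add a.txt b.txt c.txt
git commit -q -m "Initial commit"

echo "a change" >> b.txt    # b.txt is now modified
echo "a change" >> c.txt
git add c.txt               # c.txt is now staged

# In the short format, the first column means staged and the second column
# means modified but not staged. Unmodified files (a.txt) are not listed.
status=$(git status --porcelain)
echo "$status"
```

Here b.txt should be listed as " M" (modified) and c.txt as "M " (staged), while a.txt does not appear at all.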

Below are the three configuration levels available in Git:

----------------------------------------------------------------------------
 Level      : Config file location : Description
----------------------------------------------------------------------------
1. System   : /etc/gitconfig       : Applies to all users of the whole system.
2. User     : ~/.gitconfig         : User-specific configuration goes here.
3. Project  : .git/config          : Project-specific configuration goes in this file, inside the project's root folder.

Commands to configure:

git config --system/--global/--local(default) [option to configure] [value] : Sets a value at the system/user/project level. If no level is given, the value is written at the project level.
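
As a small illustration, the sketch below sets and reads a project-level value in a throwaway repository. The name and email are placeholders; the user-level variant is shown only as a comment so that running the sketch does not change your own ~/.gitconfig.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q

# Project level: written to .git/config of this repository only.
git config --local user.name "Jon Doe"
git config --local user.email "jon@example.com"

# Read a single value back, then list everything stored at project level.
name=$(git config --local --get user.name)
echo "$name"
git config --local --list

# The user level works the same way but writes to ~/.gitconfig, for example:
#   git config --global user.name "Jon Doe"
```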

Architecture & Work Flow


Traditional Source Code Manager/SubVersioning approach:

            [repository ]
commit->    Ʌ           |
            |           V  <-checkout
            -->[Work]<---

We have one repository (the remote project folder) and one working copy (the local project folder). We can check out files from the repository into the working copy, or commit changes from the working copy to the repository. The big disadvantage of this approach is that, after some time, your remote and local files get out of sync, especially in a shared working environment. This approach is also called the two-tree architecture.

Git’s Approach:

                    [repository ]
git commit->        Ʌ           V
                    | [Staging] |  <-checkout
git add files->     Ʌ           V
                    -->[Work]<---

Unlike the two tree architecture, Git uses what is called The Three Tree architecture.

It still has the repository and the working copy, but in between is another tree, the staging index. git add adds our files to the staging index, and from there we commit them.

It is important to understand that this is part of the architecture of Git, and it is a really nice feature. It means we can make changes to ten different files in our working copy and then commit only five of them as one change set. To do that, we add those five files to the staging index and, as soon as we are satisfied that they are ready, commit them to the repository as one change set. The other five files are still saved in our working tree, but they were never added to the staging index or to the repository. They sit there waiting for us to stage and commit them in a later commit.
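
The "five out of ten files" scenario can be sketched as follows. To keep it short, the ten files here are new files rather than edits to existing ones; the file names and identity are made up for the example.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Jon Doe"
git config user.email "jon@example.com"

# Create ten files in the working directory.
for i in 1 2 3 4 5 6 7 8 9 10; do
  echo "content $i" > "file$i.txt"
done

# Stage only the first five, then commit them as one change set.
git add file1.txt file2.txt file3.txt file4.txt file5.txt
git commit -q -m "Add the first five files"

# Count what reached the repository and what is still waiting.
committed=$(git ls-tree --name-only HEAD | wc -l | tr -d ' ')
waiting=$(git status --porcelain | grep -c '^??')
echo "committed: $committed, still waiting in the working directory: $waiting"
```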

And of course, we can pull things out of the repository in the same way. It is possible to move them from the repository to the staging index, and from the staging index to the working directory, but usually that is not what we do.

Example:

                        [repository ]
git commit -m ""->      Ʌ           V
                        | [Staging] |  <- git checkout HEAD File.txt
git add File.txt->      Ʌ           V
                        -->[Work]<--- File.txt

A commit hash is the SHA-1 checksum of the commit object, which records the snapshot of all the staged files together with the commit's metadata. The result is a 40-character hexadecimal number.

Git commit architecture:

Every commit has 4 crucial pieces of information attached to it. These are:

  1. Commit message : The description of the changes that you made in the commit.
  2. Author : The identity, i.e. the username and email address, of the user who made the commit.
  3. Parent commit HEAD : The hash of the previous commit, on top of which this commit is made.
  4. Commit hash/Commit HEAD : The hash of the newly created commit.

     ________________________      ________________________      ________________________
    |SHA-1 Hash[Commit HEAD] |    |SHA-1 Hash[Commit HEAD] |    |SHA-1 Hash[Commit HEAD] |
    |_________73fa0c_________|    |_________c9fd43_________|    |_________ae67f1_________|
    |Parent Commit HEAD      |    |Parent Commit HEAD      |    |Parent Commit HEAD      |
    |_________c9fd43_________|    |_________ae67f1_________|    |_________nil____________|
    |author                  |    |author                  |    |author                  |
    |_________Jon Doe________|    |_________Jon Doe________|    |_________Jon Doe________|
    |commit message          |    |commit message          |    |commit message          |
    |_________Change 2_______|    |_________Change 1_______|    |_________Initial Commit_|
    
        Snapshot v3                    Snapshot v2                    Snapshot v1
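
To see these pieces of information in a real repository, git cat-file can print a raw commit object. This is a rough sketch in a throwaway repository; the commit messages mirror the diagram above, and the hashes you get will differ from the shortened ones shown there.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Jon Doe"
git config user.email "jon@example.com"

echo "version 1" > File.txt
git add File.txt
git commit -q -m "Initial Commit"
echo "a second line" >> File.txt
git commit -q -am "Change 1"

commit_hash=$(git rev-parse HEAD)     # full 40-character commit hash
parent_hash=$(git rev-parse HEAD^)    # hash of the parent commit
echo "commit: $commit_hash"
echo "parent: $parent_hash"

# Print the raw commit object: tree, parent, author, committer, message.
git cat-file -p HEAD
```

Besides the four pieces listed above, the output also contains a tree line (the snapshot itself) and a committer line.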
    

HEAD pointer : A pointer to the tip of the currently checked-out branch, i.e. the last committed state of the repository as of the last commit or checkout. It contains the current commit hash, which will become the next commit's parent hash. For the above commit structure, 73fa0c is the HEAD pointer.

It is stored in each repository in the .git/HEAD file, which points to .git/refs/heads/master for the master branch, and to the corresponding file for any other branch.
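
The two files can be inspected directly. The sketch below assumes a freshly created repository where branch tips are still stored as plain files, and it reads the branch name from Git instead of assuming master, since the default branch name depends on your Git configuration.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Jon Doe"
git config user.email "jon@example.com"

echo "version 1" > File.txt
git add File.txt
git commit -q -m "Initial Commit"

branch=$(git symbolic-ref --short HEAD)        # name of the checked-out branch
head_file=$(cat .git/HEAD)                     # a line such as "ref: refs/heads/master"
branch_tip=$(cat ".git/refs/heads/$branch")    # commit hash the branch points to
echo "$head_file"
echo "$branch_tip"
```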

Changing last Git commit:

git commit --amend -m "" <- This will replace the last commit with a new one that has a new hash, and point HEAD to the new hash.
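
A quick way to convince yourself that amending creates a new commit rather than editing the old one is to compare the hash before and after. A minimal sketch, with placeholder messages:

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Jon Doe"
git config user.email "jon@example.com"

echo "version 1" > File.txt
git add File.txt
git commit -q -m "Inital comit"            # message with typos
before=$(git rev-parse HEAD)

git commit -q --amend -m "Initial commit"  # fix the message
after=$(git rev-parse HEAD)
count=$(git rev-list --count HEAD)         # number of commits on the branch

echo "before: $before"
echo "after:  $after"
echo "commits on the branch: $count"
```

The two hashes differ, but the branch still contains a single commit.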

Changing other Git commits:

1. git checkout [SHA HASH of the commit you want to revert to] -- <filename> <- Restores that file, in the staging index and the working directory, to its state in the given commit.
2. git revert <SHA HASH of the commit you want to undo> <- Creates a new commit that undoes the changes made by the given commit. Use "[-n]" if you do
                                                           not want to commit the revert, i.e. changes will be made but it will be left to you
                                                           to commit those changes later.
3. git reset [mode] "SHA HASH" **EXTREMELY DANGEROUS**
          "--soft" <- Will move the HEAD pointer to the specified commit but will not change the staging index or anything in the working
                        directory.
          "--mixed(default)" <- Apart from the above, it also changes the staging index to match the specified commit, but the working directory stays intact.
          ********************************************BIG RED is NEXT*********************************************************************
          "--hard" <- Apart from the above, it also changes the working directory to match the specified commit. Any uncommitted changes to
                        tracked files are lost.

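The difference between the three reset modes can be seen by undoing the same commit step by step. This is only a sketch in a throwaway repository with placeholder content; please do not try --hard on work you care about.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Jon Doe"
git config user.email "jon@example.com"

echo "version 1" > File.txt
git add File.txt
git commit -q -m "Initial Commit"
echo "version 2 with a change" > File.txt
git commit -q -am "Change 1"

# --soft: HEAD moves back one commit; the change stays in the staging index.
git reset -q --soft HEAD~1
soft=$(git status --porcelain)     # "M  File.txt" (staged)

# --mixed: the staging index is reset as well; the change stays in the working directory.
git reset -q --mixed HEAD
mixed=$(git status --porcelain)    # " M File.txt" (modified, not staged)

# --hard: the working directory is overwritten too; the change is gone.
git reset -q --hard HEAD
hard=$(git status --porcelain)     # empty
content=$(cat File.txt)            # "version 1"

echo "soft:  $soft"
echo "mixed: $mixed"
echo "hard:  $hard"
```
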
Making a new repository

  1. Make a new folder and navigate to it.
  2. Once in that folder, run git init. This command initializes the folder with a .git folder that has all the necessary config files in it.
  3. After that, you will need to copy your project files into that folder. You can do that in many ways, either by using a GUI or by using the terminal. Once the files have been copied, you need to add them to the staging index by using git add [filename/wildcard].
  4. Once you are ready to publish your work, do a final commit. After that, you will need to link your local and remote repository, which you can do by using git remote add [repositoryalias] [remote-repository-url.git].

    git init <- Used to initialize a repository, i.e. to turn a folder into a repository, which simply means that git will add a `.git` folder to it, in which it keeps all the repository data.
    
    git status <- Shows the status of the current branch, i.e. which files have been modified and which files have been staged for the next commit.
    
    git add <filename/directory_path> <- Moves a file from the modified state to the staged state (the staging index). New files are staged the same way.
    
    git commit <- Records the changes in all files found in the staging index. This will open a text editor in
                    which you will have to enter the commit message and description explaining what the commit is about.
    git commit -m "commit title goes in here" <- For a quick commit with the message passed along.
    git commit -am "" <- Adds all the modified files that are already tracked to the staging index and commits the changes. New (untracked) files are not included.
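
Put together, the steps above might look like the sketch below. The folder name my-project, the README.txt file and the example.com URL are placeholders, not anything from a real project; git remote add only records the URL and does not contact the server.

```shell
work=$(mktemp -d)
cd "$work" || exit 1

mkdir my-project && cd my-project   # step 1: new folder (hypothetical name)
git init -q                         # step 2: creates the .git folder
git config user.name "Jon Doe"
git config user.email "jon@example.com"

echo "Hello" > README.txt           # step 3: stand-in for copying your project files
git add README.txt
git status --short                  # shows README.txt as staged ("A")

git commit -q -m "Initial commit"   # step 4: commit, then link the remote
git remote add origin "https://example.com/user/my-project.git"   # placeholder URL
remotes=$(git remote -v)
echo "$remotes"
```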
    

Branching


git branch [nameofbranch] [branchparentsource] <- Create a new branch, but this would not switch to that branch.
git branch <- List all local branches of the current repository.
git branch -r <- List all remote branches of the current repository.
git branch -a <- List all branches (local and remote) of the current repository.
git branch -[d/D] [nameofbranch] <- It is used to delete a branch.
                                    "-d" is the safe option and will refuse to delete a branch that has not been merged.
                                    "-D" will forcefully delete the branch irrespective of anything.
git branch --merged <- To show which branches are completely included in, or merged with, the branch you are on.
git branch --no-merged <- Opposite of above.
git branch -m [old_name] [new_name] <- Rename a branch.

git checkout [nameofbranch] <- Switch to another branch.
git checkout -b [nameofbranch] <- Create and switch to a new branch.
git checkout -- nameoffile <- Discard the uncommitted changes to the file in the working directory and restore it from the staging index.
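
A short session using several of these commands could look like this. The branch names feature, experiment and trial are made up for the example, and the starting branch name is read from Git rather than assumed.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Jon Doe"
git config user.email "jon@example.com"
echo "version 1" > File.txt
git add File.txt
git commit -q -m "Initial Commit"
main_branch=$(git symbolic-ref --short HEAD)   # master or main, depending on your setup

git branch feature               # create a branch, but stay where we are
git checkout -q -b experiment    # create a branch and switch to it in one step
git branch -m experiment trial   # rename the branch
git checkout -q "$main_branch"   # switch back
git branch -d trial              # safe delete: allowed, trial has no unmerged commits

branches=$(git branch --format='%(refname:short)')
echo "$branches"
```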

Branch Merging

git checkout [nameofbranch] <- Always, before you issue the merge command, check out the branch you wish to merge "into" (the receiver branch).
git merge [nameofbranch] <- This is the actual merge command, nameofbranch is the branch that you wish to merge into the current branch.

Concept of fast forward: If the tip of the receiving branch is an ancestor of the tip of the branch to be merged, a fast-forward merge happens, i.e. no new merge commit is created. Instead, the receiving branch is simply moved forward to the last commit of the branch to be merged, and the HEAD pointer follows it.

git merge --no-ff [branchname] <- Disables fast-forward merging, so a merge commit is always created.
git merge --ff-only [branchname] <- Enforces fast-forward merging; the merge fails if a fast forward can't happen.
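
The difference between the two options shows up in the number of parents of the resulting commit. Below is a rough sketch with made-up branch and file names: the first merge fast-forwards, the second one is forced to create a merge commit.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Jon Doe"
git config user.email "jon@example.com"
echo "version 1" > File.txt
git add File.txt
git commit -q -m "Initial Commit"
main_branch=$(git symbolic-ref --short HEAD)

# One commit on a feature branch; the receiving branch does not move meanwhile.
git checkout -q -b feature
echo "feature work" > feature.txt
git add feature.txt
git commit -q -m "Feature work"

git checkout -q "$main_branch"
git merge -q --ff-only feature            # succeeds: a fast forward is possible
ff_tip=$(git rev-parse HEAD)
feature_tip=$(git rev-parse feature)      # same commit as ff_tip, no merge commit

# Same situation again, but this time forcing a merge commit.
git checkout -q -b feature2
echo "more work" > feature2.txt
git add feature2.txt
git commit -q -m "More feature work"
git checkout -q "$main_branch"
git merge -q --no-ff -m "Merge feature2" feature2

# rev-list prints the commit followed by its parents, so parents = words - 1.
parent_count=$(( $(git rev-list --parents -n 1 HEAD | wc -w) - 1 ))
echo "parents of the --no-ff merge commit: $parent_count"
```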

Separate commit strategy:

git merge --abort <- To stop a merge after a conflict happens and go back to the state before the merge started.

After a merge conflict happens:

  1. Resolve the conflict manually.
  2. git add [filename]
  3. git commit <- at this point, a “real conflicted merge” will finish.
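
These three steps can be tried out safely in a throwaway repository. The sketch below deliberately creates a conflict by changing the same line on two branches; the branch name feature and the file contents are placeholders.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Jon Doe"
git config user.email "jon@example.com"
echo "original line" > File.txt
git add File.txt
git commit -q -m "Initial Commit"
main_branch=$(git symbolic-ref --short HEAD)

# Change the same line on two branches to provoke a conflict.
git checkout -q -b feature
echo "line written on the feature branch" > File.txt
git commit -q -am "Change on feature"
git checkout -q "$main_branch"
echo "line written on the receiving branch" > File.txt
git commit -q -am "Change on the receiving branch"

# The merge stops and leaves conflict markers in File.txt.
git merge feature || echo "Merge stopped: resolve the conflict in File.txt"

echo "resolved line" > File.txt       # 1. resolve the conflict manually
git add File.txt                      # 2. mark the file as resolved
git commit -q -m "Merge feature"      # 3. finish the merge

parent_count=$(( $(git rev-list --parents -n 1 HEAD | wc -w) - 1 ))
echo "parents of the merge commit: $parent_count"
```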

Remote repository - GitHub

push : As the name suggests, it means to push commits upstream from the local repository to the remote repository (uploading).

origin/master : origin is the default alias given to the remote repository. origin/master is the copy, kept on your local system, of the remote repository's master branch as it was the last time you synced.

fetch : As the name suggests, it means to get the latest commits from the remote repository and update origin/master on our local system. Your own local branches are not changed.

git remote add [repository_alias] [link_to_repository.git] <- Link a remote repository with our local repository. The ".git" link is the one you will
                                                              get from your GitHub repository page, and repository_alias is an alias name
                                                              (default is origin) that you give to the remote repository in order to tell it apart from your offline repository on the local system.
git remote <- List all the added remote repository aliases.
git remote rm [repository_alias] <- To remove a remote repository.

git fetch [repository_alias] <- Used to download the latest state of the remote repository. After this, you will generally have to merge the
                                origin/master branch into your local master branch to get the latest copy of the files from the remote repository, OR you can do the following.

git pull [url_of_remote_repository/repository_alias] master <- It is a combination of git fetch and git merge origin/master into the current branch.

git clone [url_of_remote_repository] <- This will download a third party repository/project that you wish to work on.
                                        "-b" <- to check out a particular branch of the repository after cloning.

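git push -u [repository_alias] [localbranch][:][remotebranch] <- Uploads localbranch to remotebranch. "-u" also remembers the remote branch as the upstream
                                                                 of the local branch. If no value is given on the left side of the colon, git will delete that remote
                                                                 branch; instead of this method, --delete [remotebranch] can also be used.
                                                                 The colon form is optional: with a single branch name, the local branch is pushed to the remote branch of the same name.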
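
The whole round trip can be rehearsed without a GitHub account by letting a local bare repository play the role of the remote. This is only a sketch: the folder names are made up, a second clone stands in for a collaborator, and with GitHub the path to remote.git would be replaced by the repository's URL.

```shell
base=$(mktemp -d)
git init -q --bare "$base/remote.git"     # stands in for the GitHub repository

git init -q "$base/local"
cd "$base/local" || exit 1
git config user.name "Jon Doe"
git config user.email "jon@example.com"
echo "hello" > File.txt
git add File.txt
git commit -q -m "Initial Commit"
branch=$(git symbolic-ref --short HEAD)

git remote add origin "$base/remote.git"  # link local and remote
git push -q -u origin "$branch"           # upload and set the upstream

# A second clone simulates a collaborator who pushes a new commit.
git clone -q "$base/remote.git" "$base/other"
cd "$base/other" || exit 1
git config user.name "Jane Doe"
git config user.email "jane@example.com"
echo "from the collaborator" > Other.txt
git add Other.txt
git commit -q -m "Add Other.txt"
git push -q origin "$branch"

# Back in the first repository: fetch only updates origin/<branch> ...
cd "$base/local" || exit 1
git fetch -q origin
behind=$(git rev-list --count "HEAD..origin/$branch")
echo "commits on origin/$branch that are not in the local branch yet: $behind"

# ... and merging origin/<branch> brings the files into the local branch.
git merge -q "origin/$branch"
```

Running git pull origin "$branch" instead of the last two commands would have done the fetch and the merge in one go.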

Misc commands


git diff <- We can use this command to see the difference between almost anything, e.g. branches, commits, etc. With no arguments, it shows the changes in the working directory that are not staged yet, so staged changes are not shown.
git diff --staged <- This option is needed to show the changes that are staged for the next commit.
git diff --color-words <- This option highlights only the words that changed, instead of whole lines.
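
The effect of staging on git diff can be checked with a small experiment. A minimal sketch with placeholder content:

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Jon Doe"
git config user.email "jon@example.com"
echo "line 1" > File.txt
git add File.txt
git commit -q -m "Initial Commit"

echo "line 2" >> File.txt
unstaged_before=$(git diff --name-only)     # File.txt: changed, not staged yet

git add File.txt
unstaged_after=$(git diff --name-only)      # empty: plain git diff skips staged changes
staged=$(git diff --staged --name-only)     # File.txt: visible with --staged

echo "before add: $unstaged_before"
echo "after add:  $unstaged_after"
echo "staged:     $staged"
```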

git log <- To see all the commits, and their details, that lead up to the current state of the current branch.
git log -n [no. of commits] <- To specify the no. of commits you want to see, starting from the latest one.
git log --since=yyyy-mm-dd[or other time stamps] <- To see commits after a certain date.
git log --until=yyyy-mm-dd[or other time stamps] <- To see commits before a certain date.
git log --author="Name" <- To see commits by a particular commit author.
git log --grep="RegEx" <- To search the commit messages for something specific.
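
A few of these filters in action, as a sketch on a throwaway history of three commits. The --format=%s option, which prints only the commit title, is used here just to keep the output short.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Jon Doe"
git config user.email "jon@example.com"

# Build a small history: Initial Commit <- Change 1 <- Change 2
for msg in "Initial Commit" "Change 1" "Change 2"; do
  echo "$msg" >> File.txt
  git add File.txt
  git commit -q -m "$msg"
done

last_two=$(git log -n 2 --format=%s)                                   # newest first
matching=$(git log --grep="Change" --format=%s | wc -l | tr -d ' ')    # messages containing "Change"
by_author=$(git log --author="Jon" --format=%s | wc -l | tr -d ' ')    # commits by Jon

echo "$last_two"
echo "messages matching 'Change': $matching"
echo "commits by Jon: $by_author"
```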

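git help [command] <- To get help on any specific command of git (git version of manpages).
git rm [filename] <- To remove/delete a particular file from both the working directory and the staging index.
git rm --cached [filename] <- To stop tracking a file, i.e. remove it from the staging index while keeping it in the working directory.
git mv [old_filename] [new_filename] <- To rename or move a particular file.
git reset HEAD [filename] <- To unstage a file, i.e. remove its changes from the staging index while keeping them in the working directory.
git clean [-n] [-f] <- To remove untracked files. "-n" only shows what would be removed, "-f" actually removes the files.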

git config --global core.excludesfile [path of .gitignore file] <- To set the rules of one ".gitignore" file for all the repositories of a particular user.
                                                                    With this, we do not need to specify an individual ".gitignore" file for every repository, although doing so is good practice.

.gitignore : A special file that is used to exclude the files that match the rules mentioned in it. It can take values such as a list of files, *, ?, [a-z0-9], !, or the path of a dir. Generally, it is found in every repository, with settings exclusive to that repository.
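
Here is a small sketch of such rules at work. The patterns and file names are invented for the example: all .log files are ignored except keep.log, and so is everything under build/.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q

# Three rules: ignore *.log, but not keep.log, and ignore the build directory.
printf '%s\n' '*.log' '!keep.log' 'build/' > .gitignore

touch debug.log keep.log main.c
mkdir build
touch build/out.o

# List every untracked file that git still sees.
visible=$(git status --porcelain --untracked-files=all)
echo "$visible"
```

Only .gitignore, keep.log and main.c should be listed as untracked; debug.log and build/out.o are hidden by the rules.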

blob : Any file in a repository.

tree : Any dir in a repository.

tree-ish : A reference that leads to a tree, usually through a commit. It can be a SHA-1 hash (of any length that is still unambiguous), an ancestry reference, HEAD, or a branch reference.

^ : parent commit, with one ^ per generation, i.e. ^ for parent, ^^ for grandparent, ^^^ for great-grandparent. Used in “tree-ishes”

~ : level of generation, i.e. ~ for parent, ~2 for grandparent, ~3 for great-grandparent and so on. Used in “tree-ishes”

git ls-tree <tree-ish> <- to get the listing of a tree.
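
The two notations can be compared directly with git rev-parse, which turns a reference into a full hash. A minimal sketch on a throwaway history of three commits:

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Jon Doe"
git config user.email "jon@example.com"

# Build three snapshots: Initial Commit <- Change 1 <- Change 2
for msg in "Initial Commit" "Change 1" "Change 2"; do
  echo "$msg" >> File.txt
  git add File.txt
  git commit -q -m "$msg"
done

parent_caret=$(git rev-parse HEAD^)     # parent, written with ^
parent_tilde=$(git rev-parse HEAD~1)    # parent, written with ~
grand_caret=$(git rev-parse HEAD^^)     # grandparent, written with ^^
grand_tilde=$(git rev-parse HEAD~2)     # grandparent, written with ~2

# Any tree-ish works with ls-tree; here it is the grandparent commit.
listing=$(git ls-tree --name-only HEAD~2)
echo "$listing"
```

Both spellings of the parent resolve to the same hash, and so do both spellings of the grandparent.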