Introduction to Git and GitHub

Hi everyone, today we will look at the core concepts of Git and some of its most frequently used commands. We will also cover GitHub: how to use it, how to navigate it, and how to contribute to projects hosted there. So, let us get started.

So before diving deeper into Git, we need to understand what Git is.

Git is a version control system for your project. Ok, so now you might be wondering what is meant by “version control”.

So, imagine that you have an application to which hundreds of people are contributing. There might be a time when your application stops working. Wouldn't it be great if you had a snapshot of the code base from when your application was running perfectly?

That’s when the concept of Git comes into play. It stores the history of your project; it also shows you when and where the changes were made along with who made the changes.

Now that you have some knowledge about Git, you might also be wondering how people share the code they have worked on. That’s what GitHub is for. GitHub is a place where you can host your project's code so that developers from all over the world can work on it.

So quick rundown,

  • What is Git? -> A version control system that lets you save the history of your project

  • What is GitHub? -> A place where you can share your code and contribute to various open-source projects.

Now let's dive deeper into the commands of Git

Git commands and concepts

Initialize a Git repository

Navigate to the folder on your local machine in which you want to initialize a new Git repository (repositories are just folders that Git keeps track of) and use

git init

An alternative is

git init <directory>

This command can be used if you are not in the folder you want to initialize. You can just pass the directory as an argument to initialize it from outside of the directory.
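Here is a small sketch of both forms that you can try safely. It works in a throwaway folder created with mktemp, and the folder names project-a and project-b are made up for this example.

```shell
# Work in a throwaway folder so nothing else on your machine is touched.
workspace="$(mktemp -d)"

# Option 1: go into the project folder, then initialize it.
mkdir "$workspace/project-a"
cd "$workspace/project-a"
git init

# Option 2: initialize a folder from outside of it.
# Git creates the folder if it does not exist yet.
cd "$workspace"
git init project-b

# Each repository now has a hidden .git folder where Git keeps its history.
ls -a project-a project-b
```

If you delete the .git folder, the folder goes back to being a normal folder with no history.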

A quick rundown of what we just learned.

git init (Initialize a local Git repository)

git init <directory> (Initialize a local Git repository from outside the directory)

Staging your changes

Ok, so now we have learned how to initialize a Git repository, but do you think Git is already keeping track of the files in it?

The answer is no, because we haven't put our changes into the staging area yet.

git status

This command shows you the status of your files: which ones are untracked, which ones are modified, and which ones are staged.

Try making a new file and then type git status. You are likely to see a message in red saying that your files are untracked. What that means is that Git is not keeping track of these files yet, so they are not part of the project's history. To add your file you can use

git add .

OR 

git add <file name>

git add . stages all the new and changed files in the current folder (and its subfolders), which means they are now in the staging area so that a photo of them can be taken.

Ok so now what if you modify a file that is already staged? Do you have to stage it again?

The answer is yes. Now go ahead and type git status again. You are likely to see the file listed twice: once in green under the changes to be committed (the version you staged earlier) and once in red under the changes not staged for commit (your new modification). To add those new changes you need to run git add . again.

Now let's just say that you made some new files, changed some existing files, and then ran git add . as usual. But wait, you weren't supposed to stage a file that you are still working on. In that case, you can unstage that particular file by using

git restore --staged <file name>

This command will unstage a file that is already staged.
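To see the whole staging flow in one go, here is a minimal sketch. The file names and the demo identity are invented for this example, and it assumes a Git version that has git restore (2.23 or newer). It makes one commit first, because unstaging with git restore --staged needs an existing commit to compare against.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"            # placeholder identity for this demo
git config user.email "demo@example.com"

# One first commit, so the repository has some history.
echo "hello" > readme.txt
git add readme.txt
git commit -q -m "Add readme"

# Two new files: Git reports both as untracked (marked ??).
echo "finished work" > notes.txt
echo "work in progress" > draft.txt
git status --short

# Stage everything, then take the unfinished file back out of the staging area.
git add .
git restore --staged draft.txt

# notes.txt is staged (A), draft.txt is untracked again (??).
git status --short
```

git status --short is just a compact version of git status, which makes the before and after easier to compare.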

A quick rundown of what we just learned.

git status (List untracked files or tracked modified files)
git add . (Stage ALL changed files)
git add <file name> (Stage a particular file) 
git restore --staged <file name> (unstage a particular file)

Committing all the added files

Ok, so now you know how to stage your changes, but we still need to click a picture of those staged changes and save it in Git's history. To do that we use

git commit -m "<A meaningful message describing your changes>"

After doing this a picture of all the files changed/added is taken. Now go ahead and type git status. You are likely to see a message like

On branch main (Or master)
nothing to commit, working tree clean

For now, don't worry about what branches are, we will cover them in a bit.

As you can see, all our staged changes are committed and a so-called picture of our changes is taken.

Now let's look at git log. git log shows you the history of all your commits in reverse chronological order, which means new commits are displayed at the top and older commits at the bottom of the stack, with each commit built on top of the one below it. Each entry also contains the date, time, author, commit message, and a long code of letters and numbers called the hash id of the commit. The hash looks random, but Git computes it from the contents of the commit. To see your commit history just type

git log
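Here is a small sketch that makes two commits and then looks at the history. The file name, the commit messages, and the demo identity are made up for this example.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"            # placeholder identity for this demo
git config user.email "demo@example.com"

# First picture of the project.
echo "first line" > story.txt
git add story.txt
git commit -q -m "Start the story"

# Change the file, stage it again, and take a second picture.
echo "second line" >> story.txt
git add story.txt
git commit -q -m "Continue the story"

# Everything is committed, so there is nothing left to commit.
git status

# Newest commit first, one line per commit: short hash id plus message.
git log --oneline
```

The hash ids you see will differ from anyone else's, since they depend on the author and the time of the commit as well.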

Now let's talk about a hypothetical situation where you deleted a file you weren't supposed to, staged the change with git add . and then committed it with git commit -m "<some message>". Now you want to undo this. What will you do? You can't use git restore --staged <file name> since the change is already committed. That's when git reset <hash id> comes to the rescue.

git reset <hash id>

BUT ALWAYS REMEMBER all commits are built on top of each other, so if you want to un-commit (reset) a particular commit, you need to pass the hash id of the commit below it. Every commit above the one you pass is removed from the history. If you now type git log, you will see the commit whose hash id you passed at the top, and all the commits below it are left as they were.

But now you might be wondering what happened to the changes from the commits we just reset. They are not lost: by default, git reset puts them back in the unstaged area, and you can stage them as we normally do. So always be careful about the files you are committing.
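The deleted-file situation can be replayed like this. It is only a sketch: the file name and the demo identity are invented, and HEAD~1 is used as a shortcut for "the commit below the latest one" instead of copying the hash id from git log by hand. The last step (git restore on the file) goes slightly beyond what is described above and assumes Git 2.23 or newer.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"            # placeholder identity for this demo
git config user.email "demo@example.com"

echo "keep me" > important.txt
git add important.txt
git commit -q -m "Add important file"

# The mistake: delete the file, stage the deletion, and commit it.
rm important.txt
git add .
git commit -q -m "Delete important file by mistake"

# Look up the hash id of the commit *below* the bad one.
git log --oneline
good_commit="$(git rev-parse HEAD~1)"

# Un-commit the mistake. The deletion is now an unstaged change (" D").
git reset -q "$good_commit"
status_after_reset="$(git status --short)"
echo "$status_after_reset"

# Extra step: throw away the unstaged deletion to get the file back.
git restore important.txt
cat important.txt
```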

A quick rundown of what we just learned.

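git commit -m "<A meaningful message describing your changes>" (Commit all the staged changes)

git log (View the history of commits)

git log --summary (View the history along with the files created, deleted, or renamed in each commit)

git log --oneline (View the history in brief, one line per commit)

git reset <hash id> (Un-commit everything above the given commit and keep those changes as unstaged)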

All about stash

In this segment, we will be looking into the concept of stash. This segment is connected with the previous one, so do keep that in mind.

So in the last segment, we talked about git reset <hash id> and how it brings the committed changes back to the unstaged area. Now there can be a scenario where you are in the middle of writing some code and you want to try something out on a clean code base. You don't want to save the unfinished changes in a separate commit, but you don't want to lose them either. What you want is to put them aside somewhere and bring them back when needed, without making any commits. This is when git stash comes into play.

git stash only saves files that Git already tracks, so brand-new files would be left behind. So first you bring everything to the staging area by using git add . (alternatively, git stash -u also picks up untracked files). Once all the files are staged, you can add them to the stash by using

git stash

Now go ahead and run git status and you will see

On branch main (Or master)
nothing to commit, working tree clean

Ok, so how do you bring those files or changes back?

git stash pop

git stash pop will bring all those new and modified files back into your working directory, and you can then stage and commit them as we already know how to do.

You can also clear a stash by using

git stash clear

CAUTION: once you use this command, you will not get the files/changes that were in the stash back. So use it responsibly.
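Here is a minimal sketch of putting work aside and bringing it back. The file names and the demo identity are made up for this example.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"            # placeholder identity for this demo
git config user.email "demo@example.com"

echo "stable code" > app.txt
git add app.txt
git commit -q -m "Add app"

# Start an experiment: edit a tracked file and create a brand-new one.
echo "half-finished idea" >> app.txt
echo "scratch" > experiment.txt

# Stage first so the brand-new file is included, then put it all aside.
git add .
git stash

# The code base is clean again: this prints nothing.
status_while_stashed="$(git status --short)"
echo "$status_while_stashed"

# Bring the experiment back. The stash entry is removed afterwards.
git stash pop
git status --short
```

git stash list shows what is currently in the stash, which is worth checking before you ever run git stash clear.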

A quick rundown of what we just learned.

git stash (Stash changes in a dirty working directory)

git stash pop (Bring back the most recently stashed changes and remove them from the stash)

git stash clear (Remove all stashed entries)

Getting started with GitHub

Navigate to GitHub and make an account.

Now let's make a repository. After signing up, you will land on the home page.

Now look at the top left side, where you can see a green New button. Click on that and make a new repository.

In my case, I am making a private repository that can only be viewed by me. Click on Create repository.

Now we want to connect this GitHub repository to our project. You can do that by simply copying the URL of the repository.

Once you have copied the URL of your repository, go to your terminal and type

git remote add origin <repository link>

But doing only this won't push your changes to GitHub. You also need to do

git push origin main

Now you might be confused about what is meant by push, remote, origin, main, and add. Don't worry, we will cover them right now. remote means that we are working with the URLs of repositories hosted somewhere else. add means that we are going to add a new URL. origin is the name we give to that URL, something like a contact name for a phone number, since we normally don't remember all the phone numbers. main is just a branch, which we will be talking about in the branch section. As I said, git remote add origin <repository link> only saves the URL, it does not send any code to GitHub, so that is why we use push.
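If you want to try these two commands without touching GitHub, here is a sketch that uses a local folder as a stand-in for the GitHub repository. That stand-in (github-standin.git), the file name, and the demo identity are assumptions made for this example; with a real repository you would paste the URL you copied instead. It also assumes Git 2.28 or newer for the -b option, which names the first branch main.

```shell
workspace="$(mktemp -d)"

# Stand-in for the empty repository you created on GitHub.
git init -q --bare "$workspace/github-standin.git"

# A local project with one commit on main.
git init -q -b main "$workspace/project"
cd "$workspace/project"
git config user.name "Demo User"            # placeholder identity for this demo
git config user.email "demo@example.com"
echo "hello" > readme.txt
git add readme.txt
git commit -q -m "First commit"

# Save the URL (here: a folder path) under the name origin.
git remote add origin "$workspace/github-standin.git"
git remote -v

# Send the commits on main to the remote called origin.
git push -q origin main
```

git remote -v lists every saved name together with its URL, which is a quick way to check where a push will go.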

A quick rundown of what we just learned.

git remote add origin <repository link> (Connect your project with the GitHub repository)

git push origin main (Push your code to GitHub)

What are branches?

Branching is one of the most important topics in Git, and it can be a little tricky to understand sometimes. As a beginner, you might make some mistakes while working with branches.

So first let's visualize this. I am making a lot of commits.

As you can see, when I am committing changes it resembles a branch-like structure. On GitHub this default branch is named main; previously it was master, and depending on your version and configuration, Git itself may still name it master.

But why use a branch?

Whenever you are working on some new feature or resolving a bug on some open-source repository always create a separate branch. Why?

Let's take an example of an open-source repository. In our case, we will be looking into the GitHub repository of Kubernetes. Don't worry you don't need to know what Kubernetes is we are just taking it as an example for a GitHub repository.

Now as you can see, there are a bunch of files and folders, and if you notice, this is all in the main branch. These are the features used by people currently. So if we just start working on the main branch, we might add some features which are not production ready yet or which have a bug that could break the whole application. To avoid this we NEVER WORK ON THE MAIN BRANCH. Instead, create another branch on top of the main branch, and once your work is ready you can merge it into the main branch.

Funny story: when I made my first open-source contribution I didn't know this, so I went on a contributing spree all on the main branch. By the end of it, it had all turned messy and I had to close all the pull requests. So do not be like me. Always create a new branch while working on an issue or a feature.

Now what is this HEAD?

HEAD is just a pointer that by default points to main. HEAD says that every new commit you make will be added on top of whatever HEAD points to (if it is still not clear, don't worry, we will clear all these doubts with a visualization).

Ok so now you might be wondering how can you create these branches?

git branch <branch name as per your choice>

With this command, you can create a branch on top of the branch you are currently on, which so far is main. Once you create a branch you need to switch to it, or your new commits will still go to the branch you were on.

ATTENTION: Every time you create a branch, make sure you first check out the main branch and then create the branch, since every branch is built on top of the branch that you have checked out.

git checkout <name of branch you want to work on>

After making a branch and checking it out, always do git push. Why?

To create the branch on the remote repository as well. So after checkout do

git push

Since the branch does not exist on the remote yet, Git will stop and suggest a command like

git push --set-upstream origin <Branch name>

Just copy-paste that and you will have that branch on your remote repository as well.
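Here is a sketch of the create, checkout, and push steps together. As before, a local folder (remote.git) stands in for the GitHub repository; that folder, the file names, and the demo identity are assumptions for this example, and -b main needs Git 2.28 or newer.

```shell
workspace="$(mktemp -d)"

# Stand-in for the GitHub repository, plus a local project that already pushed main.
git init -q --bare "$workspace/remote.git"
git init -q -b main "$workspace/project"
cd "$workspace/project"
git config user.name "Demo User"            # placeholder identity for this demo
git config user.email "demo@example.com"
echo "hello" > readme.txt
git add readme.txt
git commit -q -m "First commit"
git remote add origin "$workspace/remote.git"
git push -q origin main

# Start from main, create the branch, and switch to it.
git checkout -q main
git branch issue-1
git checkout -q issue-1

# First push of a new branch: tell Git once which remote branch it belongs to.
git push -q --set-upstream origin issue-1

# From now on a plain git push knows where this branch goes.
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Work on issue 1"
git push -q
```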

Ok now let's visualize this for better understanding.

Here you can see that when I use git branch issue-1, it creates a separate branch named issue-1. Now let's switch to that branch by using git checkout issue-1 (keep an eye on that star on the right side of main).

You see that the * is now pointing to issue-1. That * is nothing but the HEAD that we talked about previously. So you can say that HEAD now points to the issue-1 branch. If you remember, I said that all new commits are added on top of HEAD, and since HEAD now points to issue-1, whenever we add a new commit it will be added to issue-1 and not main.

Now as you can see, whenever I make a commit it gets added to the issue-1 branch, and it is the only branch that is moving forward while the main branch stays where it was.
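You can reproduce this on your own machine with a small sketch like the one below. The file names and the demo identity are made up, and -b main assumes Git 2.28 or newer.

```shell
repo="$(mktemp -d)"
git init -q -b main "$repo"
cd "$repo"
git config user.name "Demo User"            # placeholder identity for this demo
git config user.email "demo@example.com"
echo "hello" > readme.txt
git add readme.txt
git commit -q -m "Initial commit"

# Create the branch. The * still marks main, because HEAD has not moved.
git branch issue-1
git branch

# Switch to it. Now the * marks issue-1.
git checkout -q issue-1
git branch

# A new commit is added where HEAD points, so only issue-1 moves forward.
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Fix issue 1"

# Shows both branches and which commit each one points to.
git log --oneline --decorate --all
```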

Ok, now it gets a little bit tricky so try to understand what I am going to say.

Another thing is that you are not the only one contributing to large code bases like Kubernetes. There are a lot of people working on the same code base, and there will be a time when their changes get merged into the main branch while you are still working on your branch, which would look something like this

Do you see that the main branch is now ahead of our issue-1 branch? Our branch was made on top of the previous main branch, and since main has now been updated with new content, we also need to update our branch.

Ok, so now it is possible that the feature you were working on in the issue-1 branch is finalized and ready to merge. That's when you switch to the main branch and use

git merge <name of branch you want to merge into main>

Here you can see that the issue-1 branch is merged into the main branch. On GitHub, this merging usually happens via a PR (pull request), which we will look into in a bit.
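Here is a sketch of that situation done locally: main moves ahead while issue-1 is in progress, and then issue-1 is merged into main. The file names and the demo identity are invented, and -b main assumes Git 2.28 or newer.

```shell
repo="$(mktemp -d)"
git init -q -b main "$repo"
cd "$repo"
git config user.name "Demo User"            # placeholder identity for this demo
git config user.email "demo@example.com"
echo "hello" > readme.txt
git add readme.txt
git commit -q -m "Initial commit"

# Your work happens on issue-1.
git branch issue-1
git checkout -q issue-1
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Fix issue 1"

# Meanwhile, someone else's change lands on main.
git checkout -q main
echo "other work" > other.txt
git add other.txt
git commit -q -m "Someone else's change"

# Merge the finished branch into main, the branch you are on right now.
git merge -q --no-edit issue-1

# main now contains both changes, joined by a merge commit.
ls
git log --oneline --graph
```

The --no-edit option just accepts the default merge message, so Git does not open an editor in the middle of the demo.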

To check how many branches you have created use

git branch

and look for the * to know what branch you are on currently.

You can also delete a branch by using

git branch -d <branch name you want to delete>

A quick rundown of what we just learned.

git branch <branch name as per your choice> (Create a new branch)

git checkout <name of branch you want to work on> (Switch to the branch)

git branch (List all the branches)

git merge <name of branch you want to merge into main> (Merge a branch into the active branch)

git branch -d <name of branch you want to delete> (Delete a branch)

Some commands that might be useful

git branch -a (List all branches - local and remote)

git checkout -b <Branch name> (Create a new branch and switch to it)

git branch -m <old branch name> <new branch name> (Rename a local branch)

git checkout - (Switch to the branch last checked out)

git checkout <target branch> followed by git merge <source branch> (Merge a source branch into a target branch)

Working on already existing projects

So let's just say you want to work on a project that you like. For example, React, a front-end library from Facebook.

This is their GitHub repository. But you can't contribute to it directly. Why? Because you don't have permission to push changes to it directly. So what do you do?

You Fork the repository and then clone it.

Now I'll explain what that means and how to do it. To fork, click on the Fork button near the top right of the repository page.

Once you do this, you will land on a forked repository page of your own.

Now what this means is that you have your own copy of the facebook/react project, and you can work on it as you want without affecting the original project, unless the people who have permission to change that project merge your changes.

But before coding a feature on that forked repository, you need to set it up locally. To do that, click on Code and copy the link to this repository.

Now make a folder on your computer and navigate to the folder in your IDE.

Now as you are in the folder you need to use

git clone <URL link that we just copied> .

This will set up the forked repository on your local machine. The . at the end tells Git to put the files into the current folder instead of creating a new one.

This process is called cloning the project into your local machine.
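Cloning can be tried out without a GitHub account too. In this sketch a small local repository (the fork folder) stands in for your fork on GitHub; that folder, the my-copy folder, and the demo identity are assumptions for this example. With a real fork you would use the link you copied from the Code button.

```shell
workspace="$(mktemp -d)"

# Stand-in for your fork on GitHub: a small repository with one commit.
git init -q "$workspace/fork"
cd "$workspace/fork"
git config user.name "Demo User"            # placeholder identity for this demo
git config user.email "demo@example.com"
echo "hello" > readme.txt
git add readme.txt
git commit -q -m "First commit"

# Make an empty folder, go into it, and clone into the current folder (".").
mkdir "$workspace/my-copy"
cd "$workspace/my-copy"
git clone -q "$workspace/fork" .

# The clone has the files, the full history, and a remote named origin
# that remembers where it was cloned from.
ls
git log --oneline
git remote -v
```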

A quick rundown of what we just learned.

git clone <URL link that we just copied> . (Clones the forked repository to your local machine)

Let's learn how to make a PR

As we are done setting up the project locally, we can start contributing. I'll be making a bunch of commits to show you how to generate a PR. I won't actually send a PR to the main repository, to avoid sending trash PRs to the maintainers of the project, but I will walk you through each step.

So, here I'll be simply making a Markdown (.md) file. First, as I said always make a branch for whatever you are working on.

Now we are ready to generate a PR. Just stage your changes, commit them, and push the commit to your fork.

And now as the changes are pushed to the forked repository you can see them and create a pull request.

Click on Compare & pull request

Something like this will pop up and then you can add a title and detailed description of your feature or fix. After that just click on Create pull request. And just like that you have created your first PR.

Pull and Fetch

Now let's look at pull and fetch. These are used to update your forked repository with the newest updates from the base branch (the main branch of the main repository).

As you can see, my main branch is 290 commits behind. That means that people have contributed to the project and 290 commits were merged that I don't have in my fork yet. So I have to update it by clicking on Sync fork > Update branch.

Now go to your IDE and switch to the branch that you just synced. In my case it's the main branch.

Now as you can see, I have checked out my main branch, and all you need to do is

git pull

git pull: Used to fetch content from a remote repository and immediately update the current local branch to match that content.

git fetch: Used to download content from a remote repository without changing your local branches, so you can look at it before merging.

A quick rundown of what we just learned.

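git fetch (Download changes from the remote without applying them)

git pull (Download changes from the remote and apply them to the current branch)

git pull upstream main (Directly pull changes from the original repository without syncing through the GitHub UI; this needs a remote named upstream that points to the original repository)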
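The difference between fetch and pull, and the upstream remote that the last command relies on, can be sketched like this. Local folders stand in for GitHub here: original plays the main project and fork.git plays your fork. Those names, the mine folder, and the demo identity are assumptions for this example, and -b main needs Git 2.28 or newer. With real repositories you would run git remote add upstream with the URL of the original project.

```shell
workspace="$(mktemp -d)"

# "original" stands in for the main project, "fork.git" for your fork.
git init -q -b main "$workspace/original"
cd "$workspace/original"
git config user.name "Demo User"            # placeholder identity for this demo
git config user.email "demo@example.com"
echo "v1" > readme.txt
git add readme.txt
git commit -q -m "First version"
git clone -q --bare "$workspace/original" "$workspace/fork.git"

# Your local clone of the fork, with the original project added as upstream.
git clone -q "$workspace/fork.git" "$workspace/mine"
git -C "$workspace/mine" remote add upstream "$workspace/original"

# Someone else's change lands in the original project.
echo "v2" >> readme.txt
git add readme.txt
git commit -q -m "Second version"

cd "$workspace/mine"

# fetch only downloads: upstream/main moves forward, your main does not.
git fetch -q upstream
main_before_pull="$(git rev-list --count main)"
git log --oneline upstream/main

# pull downloads and also updates the branch you are on.
git pull -q --ff-only upstream main
git log --oneline main
```

The --ff-only option makes the pull refuse to do anything complicated: it only moves your branch forward when there is nothing of your own to merge, which is the case when your main has no extra commits.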