Module_3
• If you are reading this, you may be maintaining files with names like index-v12-old2.html. Let's move away from that
toward a system that not only lets you control your source code and files, but also makes your team more productive.
Does any of this sound familiar?
• You communicate with your team about updates by email.
• You make updates directly on your production server.
• You have accidentally overwritten files that can never be retrieved.
• With version control, you can look forward to this instead:
• File names and directory structures that are consistent for all team members.
• Making changes with confidence, and even reverting when needed.
• Relying on source control as the communication medium for your team.
• Easily deploying different versions of your code to staging or production servers.
• Understanding who made a change and when it happened.
How Does Version Control Work?
• Version control systems allow multiple developers, designers, and team members to work together on the same project.
They help the team work smarter and faster. A version control system is critical for ensuring that everyone has access to
the latest code and that modifications are tracked. As development becomes increasingly complex and teams grow, there
is a greater need to manage multiple versions and components of entire products.
• The basic concepts
• Tracking changes: A version control system is built around one concept, tracking the changes that happen within
directories or files. Depending on the version control system, this can range from knowing that a file changed to knowing
which specific characters or bytes in a file changed.
• In most cases, you specify a directory or set of files that should have their changes tracked by version control. This
can happen by checking out (or cloning) a repository from a host, or by telling the software which of your files you
wish to have under version control.
• The set of files or directories under version control is commonly called a repository.
• As you make changes, the system tracks each change behind the scenes. The process stays out of your way until you
are ready to commit those changes.
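The change tracking described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular VCS is implemented: it assumes the tracked files are held in an in-memory dictionary (the file names and contents are hypothetical) and detects changes by comparing content hashes between two snapshots.

```python
import hashlib


def snapshot(files: dict) -> dict:
    """Map each tracked path to a SHA-256 digest of its content."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def detect_changes(old: dict, new: dict) -> dict:
    """Compare two snapshots and report added, deleted, and modified paths."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "deleted": sorted(old.keys() - new.keys()),
        "modified": sorted(p for p in old.keys() & new.keys() if old[p] != new[p]),
    }


# Hypothetical tracked files before and after some edits.
before = snapshot({"index.html": b"<h1>Hi</h1>", "style.css": b"h1 {}"})
after = snapshot({"index.html": b"<h1>Hello</h1>", "app.js": b"console.log(1)"})

changes = detect_changes(before, after)
print(changes)
```

A hash comparison only says that a file changed. Systems that report changed lines or bytes additionally compute a diff between the two versions.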
• Committing: As you work with your files that are under version control, each change is tracked automatically. This can
include modifying a file, deleting a directory, adding a new file, moving files or just about anything else that might alter the
state of the file. Instead of recording each change individually, the version control system will wait for you to submit your
changes as a single collection of actions. In version control, this collection of actions is known as a commit.
• Revisions and Changesets
• When a commit is made, the changes are recorded as a changeset and given a unique revision. Depending on the system,
this revision is either an incrementing number (1, 2, 3) or a unique hash
(like 846eee7d92415cfd3f8a936d9ba5c3ad345831e5). Knowing the revision of a changeset makes it easy to view or
reference that changeset later. A changeset includes a reference to the person who made the commit, when the change
was made, the files or directories affected, a comment, and even the changes that happened within the files (lines of
code).
• When it comes to collaboration, viewing past revisions and changesets is a valuable tool to see how your project has
evolved and for reviewing teammates’ code. Each version control system has a formatted way to view a complete
history (or log) of each revision and changeset in the repository.
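As a rough sketch of the ideas above, a changeset can be modelled as a record whose revision is a hash of its contents. The field names, the sample authors, and the way the hash is computed are assumptions made for illustration. Real systems each define their own format.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Changeset:
    author: str
    timestamp: str            # ISO 8601 string, fixed here so the example is reproducible
    message: str
    files: Tuple[str, ...]    # paths affected by the commit
    parent: Optional[str] = None  # revision of the previous changeset, if any

    @property
    def revision(self) -> str:
        """Hash-style revision: a SHA-1 digest over the changeset's fields."""
        payload = "\n".join(
            [self.parent or "", self.author, self.timestamp, self.message, *self.files]
        )
        return hashlib.sha1(payload.encode("utf-8")).hexdigest()


def format_log(history):
    """Return one log line per changeset, newest first."""
    return [
        f"{c.revision[:7]} {c.timestamp} {c.author}: {c.message}"
        for c in reversed(history)
    ]


first = Changeset("alice", "2024-01-01T10:00:00", "Add home page", ("index.html",))
second = Changeset(
    "bob", "2024-01-02T09:30:00", "Style the header",
    ("index.html", "style.css"), parent=first.revision,
)

log_lines = format_log([first, second])
for line in log_lines:
    print(line)
```

Because each revision depends on its parent's revision, changing an old changeset would change every revision that follows it, which is one reason hash-based revisions are useful for referencing history.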
Why Is VCS Important?
• Version control is important for keeping track of changes and for keeping every team member working on the right
version. You should use version control software for all code, files, and assets that multiple team members collaborate on.
• Version control software needs to do more than just manage and track files. It should help you develop and ship products
faster. This is especially important for teams practicing DevOps.
• Using the right version control software:
• Improves visibility.
• Helps teams collaborate around the world.
• Accelerates product delivery.
What Is Version Control with Examples?
• Version control systems keep track of every change ever made. Here are a few of the most popular version control
systems:
• Git
• SVN
• ClearCase
• Mercurial
• TFS
Brief History of VCS
Generations of VCS:
Types Of VCS
• Local Version Control Systems: This is one of the simplest forms of version control. A local VCS has a database that keeps
all the changes to files under revision control. RCS is one of the most common local VCS tools. It keeps patch sets
(differences between files) in a special format on disk. By adding up all the patches, it can re-create what any file looked
like at any point in time.
• Centralized Version Control Systems: A centralized version control system (CVCS) has just one repository, and every user
must commit to it for their changes to be reflected there. Others see your changes by updating. A CVCS enables
collaboration among developers and gives everyone some insight into what the rest of the team is doing on the project. It
also gives administrators fine-grained control over who can do what.
• Distributed Version Control Systems: Distributed version control systems contain multiple repositories. Each user has their
own repository and working copy. Just committing your changes does not give others access to them, because a commit
records those changes only in your local repository. You need to push them to make them visible in the central
repository. Similarly, when you update, you do not get others' changes unless you have first pulled those changes into
your repository.
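The difference between committing and pushing in a distributed system can be sketched as follows. This is a toy model under strong assumptions: history is a plain list of commit messages, histories never diverge, and the `Repository` class and its method names are invented for the example.

```python
class Repository:
    """Toy repository: history is a linear list of commit messages."""

    def __init__(self, name):
        self.name = name
        self.commits = []

    def commit(self, message):
        # A commit is recorded only in this repository.
        self.commits.append(message)

    def push(self, remote):
        # Send the commits the remote lacks.
        # Assumes the remote's history is a prefix of ours (no divergence).
        remote.commits.extend(self.commits[len(remote.commits):])

    def pull(self, remote):
        # Fetch the commits we lack.
        # Assumes our history is a prefix of the remote's (no divergence).
        self.commits.extend(remote.commits[len(self.commits):])


central = Repository("central")
alice = Repository("alice")
bob = Repository("bob")

alice.commit("Fix typo in README")
bob.pull(central)
visible_before_push = list(bob.commits)   # Alice's commit is still local to her

alice.push(central)
bob.pull(central)
visible_after_push = list(bob.commits)    # now Bob has it

print(visible_before_push, visible_after_push)
```

In a centralized system the `commit` step itself would write to the single shared repository, so there would be no separate push.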
Different Operations and Modes Available in VCS
• There are different version control systems (such as Git and Subversion), and each has its own advantages and uses in
production systems. Version control has become a central part of the development process, so learning to use a VCS well
is important. Different VCSs may look different, but the basic operations are the same.
• Create:
• The create operation creates a new repository. A repository is a database where all the edits or updates
of the source code are stored.
• The repository keeps track of the tree, i.e., all the files and the layout of the directories in which they are stored.
• A repository is a three-dimensional entity that exists in a continuum defined by directories, files, and time.
• When creating a new repository, we need to specify its name and the location in which it is to be created.
• Checkout:
• The checkout operation is used to make a new working copy for a repository that already exists.
• A working copy is a snapshot of the repository used as a place to make changes. Though the repository is shared by
all the developers, the changes are made only to the working directory first, and then merged with the repository.
• A working copy is a private space to a developer and changes made to this working copy will not affect the rest of the
team.
• The working copy is more than just a snapshot of the contents of the repository. It also contains some
metadata so that the VCS can keep careful track of the state of things.
Identify the Purpose of Operations:
• Commit and Update:
• The Commit operation is used to apply the modifications in the working copy to the repository as a new changeset.
• The Commit operation takes the pending changeset and uses it to create a new version of the tree in the repository.
• The Update operation is used to bring the working copy up to date with respect to the repository.
• Update is the mirror image of Commit. Both operations move changes between the
working copy and the repository.
• Commit goes from the working copy to the repository. Update goes in the other direction.
• Add, Edit and Delete
• The Add operation takes a file or directory that exists in the working copy but is not yet under version control and puts
it in the pending changeset, so that it is added to the repository on the next commit.
• The Edit operation is used to modify a file.
• Edit does not directly involve the VCS. Edits are made using a text editor or development environment,
and the VCS notices the change and makes the modified file part of the pending changeset.
• The Delete operation is used to delete a file or a directory.
• The Delete operation removes the file from the working copy immediately, but the deletion of the file in the repository is
only added to the pending changeset.
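The operations above can be tied together in a small sketch. It is a toy model rather than the behaviour of any real VCS: the `Repo` and `WorkingCopy` classes are hypothetical, each revision stores a full copy of the tree, constructing a `WorkingCopy` stands in for checkout, and `update` assumes there are no pending local changes.

```python
class Repo:
    """Toy repository: trees[n] is the full file tree at revision n."""

    def __init__(self):
        self.trees = [{}]  # revision 0 is the empty tree


class WorkingCopy:
    """Private copy of one revision plus metadata (the base revision)."""

    def __init__(self, repo):  # plays the role of checkout
        self.repo = repo
        self.base = len(repo.trees) - 1
        self.files = dict(repo.trees[self.base])
        self.tracked = set(self.files)

    def add(self, path, content):
        self.files[path] = content
        self.tracked.add(path)

    def edit(self, path, content):
        # A plain edit; the VCS notices it later by comparing against the base.
        self.files[path] = content

    def delete(self, path):
        # Removed from the working copy at once; the repository is untouched.
        del self.files[path]
        self.tracked.discard(path)

    def pending(self):
        """The pending changeset: differences from the base revision."""
        base = self.repo.trees[self.base]
        current = {p: c for p, c in self.files.items() if p in self.tracked}
        return {
            "added": sorted(current.keys() - base.keys()),
            "modified": sorted(p for p in current.keys() & base.keys() if current[p] != base[p]),
            "deleted": sorted(base.keys() - current.keys()),
        }

    def commit(self):
        if self.base != len(self.repo.trees) - 1:
            raise RuntimeError("working copy is out of date; update first")
        self.repo.trees.append({p: c for p, c in self.files.items() if p in self.tracked})
        self.base = len(self.repo.trees) - 1
        return self.base

    def update(self):
        # Assumes no pending local changes; a real VCS would merge them.
        self.base = len(self.repo.trees) - 1
        self.files = dict(self.repo.trees[self.base])
        self.tracked = set(self.files)


repo = Repo()
alice = WorkingCopy(repo)
alice.add("a.txt", "1")
alice.add("b.txt", "2")
alice.commit()

bob = WorkingCopy(repo)
alice.edit("a.txt", "1!")
alice.delete("b.txt")
alice.add("c.txt", "3")
pending = alice.pending()
new_revision = alice.commit()

bob_before_update = dict(bob.files)
bob.update()
bob_after_update = dict(bob.files)
print(pending, new_revision)
```

Note how Alice's edits do not affect Bob's working copy until she commits and he updates.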
What is CVS?
• Every independent software module is developed and continually upgraded as developers discover new and effective
ways to improve its efficiency. CVS (Concurrent Versions System) manages the different versions of a module so that if a
later version turns out to have defects, an earlier version can be referenced and used.
• CVS manages consistency among different files using three concepts. File locking ensures that a file is modified
by only one person at a time. Changes to a file that several people work on can be monitored using the watch command.
• CVS applies policies for handling conflicts when the same file has been modified by more than one developer. It supports
an option to include both modified versions in the same file, separated by delimiters.
• CVS offers security through password authentication or Kerberos via the Generic Security Services Application Program
Interface (GSSAPI). Finally, all changes that were made successfully can be saved using the commit command from the
command line interface.
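The delimiter mechanism mentioned above can be sketched with a deliberately simplified three-way merge. This is an illustration only: it assumes all three versions have the same number of lines and compares them line by line, whereas real merge tools align lines with a diff algorithm first. The `merge_lines` function, the default file name, and the revision label are hypothetical, and the marker layout follows the `<<<<<<<` / `=======` / `>>>>>>>` style.

```python
def merge_lines(base, mine, theirs, filename="file.txt", their_rev="1.2"):
    """Merge two edited versions of a file against their common base.

    Simplifying assumption: all three versions have the same number of lines.
    Returns the merged lines and the number of conflicts found.
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles versions of equal length")

    merged, conflicts = [], 0
    for b, m, t in zip(base, mine, theirs):
        if m == t:            # both sides agree (or neither changed the line)
            merged.append(m)
        elif m == b:          # only their side changed the line
            merged.append(t)
        elif t == b:          # only my side changed the line
            merged.append(m)
        else:                 # both changed the same line differently: conflict
            conflicts += 1
            merged += [f"<<<<<<< {filename}", m, "=======", t, f">>>>>>> {their_rev}"]
    return merged, conflicts


base = ["a", "b", "c"]
clean, clean_conflicts = merge_lines(base, ["a", "B", "c"], ["a", "b", "C"])
marked, marked_conflicts = merge_lines(base, ["a", "x", "c"], ["a", "y", "c"])

print(clean, clean_conflicts)
print("\n".join(marked))
```

When a conflict is marked, the developer edits the file to keep the intended text, removes the delimiters, and commits the result.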
What is CVS not?
• CVS is not a build system: Though the structure of your repository and modules file interact with your build system (e.g.
Makefiles), they are essentially independent. CVS does not dictate how you build anything. It merely stores files for
retrieval in a tree structure you devise.
• CVS does not dictate how to use disk space in the checked out working directories. If you write your Makefiles or scripts in
every directory so they have to know the relative positions of everything else, you wind up requiring the entire repository
to be checked out.
• If you modularize your work, and construct a build system that will share files (via links, mounts, VPATH in Makefiles, etc.),
you can arrange your disk usage however you like.
• But you have to remember that any such system is a lot of work to construct and maintain. CVS does not address the
issues involved.
• Of course, you should place the tools created to support such a build system (scripts, Makefiles, etc.) under CVS.
• Figuring out what files need to be rebuilt when something changes is, again, something to be handled outside the scope of
CVS. One traditional approach is to use make for building, and use some automated tool for generating the dependencies
which make uses.
• See Builds for more information on doing builds in conjunction with CVS.
• CVS is not a substitute for management: Your managers and project leaders are expected to talk to you frequently
enough to make certain you are aware of schedules, merge points, branch names and release dates. If they don’t, CVS
can’t help.
• CVS is an instrument for making sources dance to your tune. But you are the piper and the composer. No instrument plays
itself or writes its own music.
• CVS is not a substitute for developer communication.
• When faced with conflicts within a single file, most developers manage to resolve them without too much effort. But a
more general definition of “conflict” includes problems too difficult to solve without communication between developers.
• CVS cannot determine when simultaneous changes within a single file, or across a whole collection of files, will logically
conflict with one another. Its concept of a conflict is purely textual, arising when two changes to the same base file are
near enough to spook the merge (i.e., diff3) command.
• CVS does not claim to help at all in figuring out non-textual or distributed conflicts in program logic.
• For example: Say you change the arguments to function X defined in file A. At the same time, someone edits file B, adding
new calls to function X using the old arguments. You are outside the realm of CVS’s competence. Acquire the habit of
reading specs and talking to your peers.
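The function X scenario above can be made concrete with a short sketch. The function names and arguments are hypothetical. The two edits touch different files, so a purely textual merge succeeds, yet the combined program fails when it runs.

```python
# "File A" after your change: function X now requires a second argument.
def price_label(amount, currency):
    return f"{amount:.2f} {currency}"


# "File B", edited by a teammate at the same time: a new call to function X
# that still uses the old, single-argument form.
def render_cart(total):
    return price_label(total)


# No line was changed by both people, so a textual merge reports no conflict.
# The logical conflict only shows up when the merged code is executed.
try:
    render_cart(9.5)
    outcome = "ok"
except TypeError:
    outcome = "TypeError"

print(outcome)
```

Automated tests run after a merge can catch this kind of problem, but they complement rather than replace communication between developers.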
• CVS does not have change control: Change control refers to a number of things. First of all, it can mean bug-tracking, that
is, being able to keep a database of reported bugs and the status of each one (Is it fixed? In what release? Has the bug
submitter agreed that it is fixed?). For interfacing CVS to an external bug-tracking system, see the rcsinfo and verifymsg
files (see Administrative files).
• Another aspect of change control is keeping track of the fact that several files were changed together as
one logical change. If you check in several files in a single cvs commit operation, CVS then forgets that those files were
checked in together, and the fact that they have the same log message is the only thing tying them together. Keeping a
GNU-style ChangeLog can help somewhat.
• Another aspect of change control, in some systems, is the ability to keep track of the status of each change. Some changes
have been written by a developer, others have been reviewed by a second developer, and so on. Generally, the way to do
this with CVS is to generate a diff (using cvs diff or diff) and email it to someone who can then apply it using the patch
utility. This is very flexible, but depends on mechanisms outside CVS to make sure nothing falls through the cracks.
What are the advantages of Git?
• Performance
• Git performs strongly and reliably compared with other version control systems. Committing new changes, branching,
merging, and comparing versions are all optimized for performance. The algorithms implemented in Git take
advantage of deep knowledge about the attributes of real source code file trees, how files are modified over time, and
which file access patterns developers use when recalling files. When determining storage and version history, Git
focuses on file content rather than file names. The object formats of Git repository files use a combination of delta
encoding and compression to store directory contents and metadata objects.
• Security
• Git is designed to maintain the integrity of source code. File contents, as well as the relationships between
files and directories, tags, commits, and versions, are secured with a cryptographic hashing algorithm called SHA-1,
which protects the code and change history against accidental as well as malicious changes. With Git you can be
confident that you have an authentic content history for your source code.
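To make the role of SHA-1 concrete, the sketch below computes the identifier Git assigns to a file's content (a "blob"): the SHA-1 digest of a short header, `blob <size>` followed by a NUL byte, and then the content itself. The function name `git_blob_id` is invented for the example, and the sketch assumes the default SHA-1 object format.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git uses for a blob with this content."""
    header = b"blob %d\x00" % len(content)   # object type, size in bytes, NUL separator
    return hashlib.sha1(header + content).hexdigest()


empty_id = git_blob_id(b"")
original_id = git_blob_id(b"print('hello')\n")
tampered_id = git_blob_id(b"print('hellp')\n")

print(empty_id)
print(original_id != tampered_id)
```

Because the identifier is derived from the content, any change to a file, however small, produces a different ID, which is how corruption or tampering in the history becomes detectable. The same result for a file on disk can be checked with `git hash-object <file>`.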
What are the advantages of Git?
• Flexibility
• A key design objective of Git is flexibility: it supports several kinds of nonlinear development workflows, it is
efficient for both small and large projects, and it works with a range of protocols. It is designed to treat branching and
tagging as first-class operations, and it stores the operations that affect branches and tags as part of the change
history. Not all VCSs support this level of tracking.
• Wide acceptance
• Git offers the performance, functionality, security, and flexibility that most developers and teams need to
develop their projects. Compared with other VCSs, Git is the most widely adopted, largely because of its usability and
performance.
• Quality open source project
• Git is a widely supported open source project that has been in use since 2005. The people maintaining the
project have shown a mature approach and a long-term vision for meeting the needs of users, with regular releases
that improve functionality as well as usability. Because Git is open source, its quality is open to scrutiny, and many
businesses today depend heavily on that quality.
How Is Git Revolutionizing IT?
• If your business depends heavily on IT and software processes, or you are a software development organization, Git
radically changes how your team creates and delivers work. Processes including design, development, product
management, marketing, and customer support can all be handled and maintained using Git in your organization.
• Feature Branch Workflow
• Git has powerful branching capabilities. To start work, developers first create a unique branch. Each branch
functions as an isolated environment in which changes to the codebase are carried out. This ensures that the master
branch always contains production-quality code. Besides being more reliable, it is also much easier to make changes on
a Git branch than to edit the main codebase directly.
• Distributed Development
Why use Git?
• Performance: Git performs very well compared with other version control systems. Committing, branching, and
merging are all optimized for better performance than in other systems.
• Security: Git protects the integrity of your work with the SHA-1 cryptographic hash function. Your versions, files, and
directories are identified by their hashes, so corruption of your work can be detected.
• Branching Model: Git has a different branching model from other VCSs. The Git branching model lets you have multiple
local branches that are independent of each other. This enables frictionless context
switching (switch to a branch to try something, commit, and switch back), role-based branches (one branch that always
goes to production, another for testing, etc.) and disposable experimentation (try something out and, if it does not work,
delete it without any loss of code).
• Staging Area: Git has an intermediate stage called the "index" or "staging area", where the contents of a commit can be
reviewed and adjusted before the commit is completed.
• Distributed: Git is distributed in nature. This means that the repository, i.e., the complete code base with its history, is
mirrored onto each developer's system so that they can work on it locally.
• Open Source: Being open source invites developers from all over the world to contribute to the software and to make
it more powerful through features and additional plugins. The same model has let the Linux kernel grow to about
15 million lines of code.
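The staging area from the list above can be sketched as a simple holding area between the working files and a commit. This is a toy model: the `StagingArea` class, its method names, and the file names are hypothetical, and unlike Git's real index it only tracks the changes selected for the next commit.

```python
class StagingArea:
    """Toy index: holds only the changes selected for the next commit."""

    def __init__(self):
        self.staged = {}    # path -> content chosen for the next commit
        self.history = []   # completed commits, oldest first

    def add(self, path, content):
        self.staged[path] = content

    def unstage(self, path):
        # Change your mind before committing; the working file is unaffected.
        self.staged.pop(path, None)

    def commit(self, message):
        if not self.staged:
            raise ValueError("nothing staged to commit")
        commit = {"message": message, "files": dict(self.staged)}
        self.history.append(commit)
        self.staged.clear()
        return commit


index = StagingArea()
index.add("login.py", "fixed password check")
index.add("notes.txt", "scratch notes")
index.unstage("notes.txt")             # keep unrelated edits out of this commit
first_commit = index.commit("Fix login bug")

print(first_commit)
```

Staging makes it possible to split a batch of edits into several focused commits instead of committing everything that happens to be modified.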
How Git is aligning with Linux
• Initially, Linus Torvalds used no VCS. Patches developed by kernel contributors were applied manually by Linus.
• Linus was not satisfied with any of the VCSs available at that time, but kernel developers insisted on having one.
• In 2002, Linus opted for Bitkeeper, a closed-source commercial system developed by the company BitMover.
• BitMover imposed certain restrictions on the Linux community in exchange for the free license and demanded control
over some metadata.
• In 2005, due to a contract breach, BitMover revoked all the licenses issued to the Linux community, which was forced
to find an alternative.
• Linus began writing Git in April 2005, and within days Git had become fully self-hosting.
• We have already seen that Git originated because of the licensing issue that the Linux community faced with Bitkeeper.
Let's have a detailed look at how the Linux community managed their source code earlier and how Linus Torvalds ended
up creating Git on his own.
• Initially there was no version control in the Linux community, and Linus Torvalds, the creator of Linux, manually
applied patches to the source tree whenever contributors submitted them. Though open source VCSs like
CVS were around at that time, they had many limitations. In particular, the way CVS tracked changes and managed
conflicts was troublesome. Linus was not a great fan of SVN either.
• Not having a proper revision control system made the kernel developer community unhappy and Linus was forced to
choose a version control system.
• To everyone's shock, Linus, one of the greatest advocates and practitioners of open source, selected Bitkeeper, a
closed-source commercial VCS provided by the company BitMover. This created a lot of controversy in the kernel
development community, but that did not change Linus's standpoint. Bitkeeper's major claim was that it offered a
distributed system, which no other provider offered at that time. This was the main reason Linus insisted on Bitkeeper.
• The major drawback was that Bitkeeper imposed certain restrictions on the kernel development community in exchange
for the free license. One major condition was that Linux developers should not work on any competing revision control
projects while using Bitkeeper.
• The second condition was that BitMover, in order to monitor any abuse of the free license, would control certain
metadata of the kernel project. Kernel developers could not compare previous kernel versions without the metadata.
Despite these conditions, Linus continued to use Bitkeeper for years.
• The concept of a distributed version control system may sound familiar now, but in 2002 it was a completely new idea.
Linus was inclined towards Bitkeeper because sub-groups of kernel developers could collaborate independently with the
benefit of revision control and then feed their changes up to Linus when they were ready.
• This way, a huge portion of the work done by Linus (applying the patches manually) could be distributed to other
groups. Many features could be developed independently and merged into the Linux source tree. None of the open
source VCSs available at that time provided these advantages.
• In 2005, Andrew Tridgell, the creator of Samba, tried to reverse engineer Bitkeeper to create an open source alternative.
This was against the contract, so BitMover revoked the licenses issued to the Linux community, and there arose a sudden
sense of uncertainty.
• To get over this, Linus stopped working on the Linux kernel, the first such pause since its inception in 1991, and started
writing his own version control system. That is how Git was born.
• From the outset, Torvalds had one philosophical goal for Git, to be the anti-CVS, plus three usability design goals:
• Support distributed workflows similar to those enabled by Bitkeeper
• Offer safeguards against content corruption
• Offer high performance
• Within days of the start of development, in April 2005, Linus's Git revision control system had become
fully self-hosting, and within a few weeks it was ready to host Linux kernel development. Within a few months the
community was participating heavily in the further development of Git. It has since become an essential
tool in most developers' work.