C for Biologists
1 author:
Jeyakodi Gopal
Indian Council of Medical Research
All content following this page was uploaded by Jeyakodi Gopal on 13 October 2015.
Acknowledgement
1 INTRODUCTION
1.1 Classification of Programming Language
2 C FUNDAMENTALS
2.1 History
2.2 Features of C
2.3 Applications and Advantages of C
2.4 Basic structure of C program
2.5 Executing a C program
2.6 Character set in C
2.7 C Tokens
2.8 Declaration of variables
2.9 User defined type declaration
2.10 Assigning values to variables
2.11 Compile time assignment
2.12 Data Input and output statements
2.12.1 Character Input and Output
2.12.2 String Input and Output
2.12.3 Formatted Input and Output
2.13 Operators and Expressions
2.13.1 Arithmetic Operators
2.13.2 Relational Operators
2.13.3 Logical Operators
2.13.4 Assignment Operators
2.13.5 Increment and Decrement Operators
2.13.6 Conditional Operators
2.13.7 Bitwise Operators
2.13.8 Special Operators
2.14 Arithmetic Expression
2.15 Precedence of Arithmetic Operators
2.16 Type conversion in Expressions
2.16.1 Implicit type conversion
2.16.2 Explicit conversion
2.17 Operator precedence and associativity
3 CONTROL STATEMENTS IN C
3.1 Control statements
3.2 Branching statements
3.2.1 Conditional branching statements [ if statement ]
3.2.2 Conditional branching statements [ if – else – if statement ]
3.2.3 Conditional branching statements [ Nested – if statement ]
3.2.4 Conditional branching statements [ switch – case ]
3.3 Looping statements
3.3.1 The while statement
3.3.2 The do while statement
3.3.3 The for statement
4 ARRAYS
4.1 One dimensional array
4.1.1 Declaration
4.1.2 Memory representation
4.1.3 Initialization
4.2 Two dimensional array
4.2.1 Declaration
4.2.2 Memory representation and Initialization
4.3 Multi dimensional array
5 FUNCTIONS
5.1 What is Function
5.2 Advantages of Function
5.3 Types of Function
5.3.1 Built-in / Library Functions
5.3.1.1 Mathematical Functions
5.3.1.2 Character Functions
5.3.2 User defined Functions
5.3.2.1 Elements of user defined Functions
5.4 Category of Functions
5.5 Recursion
5.6 The scope of a variable
5.6.1 Automatic variables
5.6.2 External variables
5.6.3 Static variables
5.6.4 Register variables
6 STRINGS
6.1 String Declaration
6.2 String Initialization
6.3 Reading strings from terminal
6.4 Writing strings to terminal
6.5 Reading a line of text
6.6 String Handling functions
7 POINTERS
7.1 What is a pointer
7.2 Advantages of pointers
7.3 Pointer variables
7.3.1 The address operator
7.3.2 The indirection operator
7.4 Declaring pointer variables
7.5 Pointers and Arrays
7.6 Pointer Arithmetic
7.6.1 Incrementing pointer
7.6.2 Decrementing pointer
7.7 Pointer and Functions
7.8 Function call by value and Function call by reference
9. FILE PROCESSING IN C
9.1 Defining and Opening a File
9.2 Closing a File
9.3 File Input and Output
9.3.1 Character Input and Output
9.3.2 String Input and Output
9.3.3 Formatted Input and Output
References
Acknowledgment
I wish to thank Prof P.P Mathur, Centre Head, Centre for Bioinformatics, Pondicherry
University, Puducherry who inspired me to write this book.
My sincere thanks to Mr. M. Sundaramohan, Information Officer, Pondicherry University for his
valuable suggestions in composing this work, and Dr. Ayalaru Murali, Assistant Professor,
Centre for Bioinformatics, Pondicherry University, for his timely help in reviewing this work
and in the successful completion of the book.
My heartfelt thanks to all my colleagues, for their undivided support and encouragement and to
all my students, who motivated me to write this book.
I am very much grateful to my husband, son and my family members for their continuous
support in writing this book.
G. Jeyakodi
1
Introduction
Language is a medium of communication. A programming language is a set of instructions and
rules used to tell a computer what work to do. Programming languages were developed with the
primary objective of enabling a large number of people to use computers without needing to
know the details of the computer's internal structure. Writing a program or software as a set
of instructions or statements in such a language is called programming.
Examples: BASIC, COBOL, Pascal, C, C++, Java, etc.
To stay within the scope of the book and the available space, this chapter gives only a very
brief account of the history of computer languages and of the tools needed for efficient
programming. Readers interested in a detailed account of these topics are encouraged to refer
to the books listed in the reference section.
All computer languages can be classified broadly into the following three categories:
Machine language
Assembly language
High level language
The language that the computer directly understands is called the machine language of the
computer. This language is composed of sequences of 1's and 0's. The circuitry of a computer
is wired in such a manner that it recognizes machine language instructions immediately and
converts them into the electrical signals needed to execute them.
A machine language instruction normally has a two-part format. The first part is the operation
code and the second part is the operand. The operation code tells the computer what function
to perform, and the operand tells the computer where to find or store the data to be
manipulated, including the length and locations of the data fields involved in the operation.
Every computer has a set of operation codes called its instruction set. Each operation code
(or opcode) in the instruction set is meant to perform a specific basic operation or function.
Arithmetic operations, logical operations, branch operations and data movement operations are
the typical operations included in the instruction set.
Example
ADVANTAGES:
DISADVANTAGES:
Machine dependent
Programming is very difficult
Difficult to understand
Difficult to write bug free programs
Difficult to isolate an error.
A language that allows instructions and storage locations to be represented by letters and
symbols instead of numbers is called an assembly language or symbolic language. A program
written in an assembly language is called an assembly language program or symbolic program.
Since an assembly language program is written with symbols, the computer cannot execute it
directly. It requires a translator to convert the assembly code into machine code.
“An assembler is the system software that translates assembly code into machine code”.
Mnemonic Meaning
ADVANTAGES:
DISADVANTAGES:
A language composed of English-like statements is called a high level language. Such a
program can be easily understood by the programmer, but it requires a translator to convert
the code into a form the machine understands. “A compiler is system software that converts a
high level program into machine code”.
ADVANTAGES:
Machine independent
Easier to learn and use
Fewer errors
Easier to maintain
DISADVANTAGES:
Less efficient
Requires a translator
The fourth generation languages are very simple. The third generation languages are
considered to be procedural languages, whereas the fourth generation languages are largely
non-procedural. The fourth generation languages have a minimum number of syntax rules, hence
people who are not programmers can use them easily.
i. Query languages
ii. Report Generation languages
iii. Application Generation language
ADVANTAGES:
DISADVANTAGES:
Fifth generation languages might be the future of programming languages. These languages
will be able to process natural language, so the user will be free from learning any
programming language to communicate with the computer.
Problem solving is the task of expressing the solution to a complex problem in terms of
simple operations understood by the computer. Problem solving using a computer requires a
well-defined sequence of steps carried out in a systematic manner.
Problem definition.
Problem analysis.
Design of a solution using design tools such as Flowcharts and Algorithms.
Coding or programming
Checking and correcting errors.
Documentation
Program Enhancement or Maintenance
The basic concept, which should be clearly understood, is that “Success can be achieved in
problem solving only when we know what we want to do”, i.e., the problem should first be
clearly understood by the user. Proper involvement in this process always helps in generating
a good, workable solution.
This step of problem solving analyzes what must be done rather than how to do it. A clear
understanding of the problem is a must. This step also requires the exact specification of the
problem. The specification may be in the user's language, such as English or some other
natural language. It may include charts, tables and equations of different kinds. Knowledge
about the specification can be gathered using techniques such as observation of the actual
task, interviews and so on.
Example 1.1
Suppose we have a set of visiting cards, each card containing a name, an address and a
telephone number. The problem is to find the telephone number corresponding to any given
name.
In the first stage of problem solving it is necessary for us to be more precise about the
structure of the input data. We may want to know whether the visiting cards are arranged in
alphabetical order of names or in random order. We will assume that the cards are not in
order, but that we can go through the cards one at a time from the first to the last. Next we
must also decide what action to perform if the name is not present on any of the visiting
cards. The problem was not precisely defined because it did not tell us about this.
Therefore, the problem statement is to be corrected by indicating that the output should be
either the phone number of the person or a message that the name is not present.
The problem analysis step focuses on understanding the requirements of the problem to be
solved. This process is the first step towards the solution domain. Explicit requirements
about the input and output, time constraints, processing requirements, accuracy, memory
limitations, error handling and interfaces are understood at this stage. The end result of
this analysis is either the selection of a method to be used on the computer, or a decision
that a computer should not be used because, given the constraints, manual methods are better.
To completely and properly specify the input to a program, it is necessary to answer certain
questions such as:
Similarly, to specify the output of a program it is necessary to answer certain questions
such as:
1. What are the outputs generated by the program? (The outputs should be clearly and
unambiguously specified)
2. What is the format of the outputs? (This includes type, accuracy and units)
3. What is the output device?
4. How should the outputs be displayed? (This includes spacing, layout and heading)
Example 1.2: Telephone number search problem
In the analysis of the problem, it is decided that the only reasonable method, given the
structure of the input data, is to look at each card one after the other and compare the name
it contains with the name being searched. If the name is found, stop and report the telephone
number; otherwise continue with the remaining cards one after the other. If the search
reaches the end of the cards without finding the name, we output the message that the name is
not present.
After the completion of problem definition and problem analysis, it is necessary to design
the solution of the problem. The solution should include a sequence of steps that will input
and manipulate the data and produce the desired output. Good design can be done efficiently
with the choice of certain design tools. Algorithms and flowcharts are two design tools that
help to represent the solution of a problem.
One more design technique is top-down design, which is used to solve complex problems more
effectively by dividing them into sub-problems. Sub-problems are easier to solve than the
complete problem.
I. ALGORITHM
CHARACTERISTICS OF AN ALGORITHM
Example 1.3
Step1: Start
Step6: Stop
Steps involved in developing an algorithm
1. Clearly understand the problem statement so that a proper algorithm can be evolved.
2. Study the outputs to be generated so that the input can be specified.
3. Design the process, which will produce the desired result after taking the input.
4. Refine the process.
5. Test the algorithm by giving the test data and see if the desired output is generated. If
not, make appropriate changes in the process and repeat the process.
ADVANTAGES OF ALGORITHM
Easy to understand
It has got a definite procedure, which can be executed within a set period of time.
It is easy to first develop an algorithm, and then convert it into a flowchart and then
into a computer program.
It is independent of programming language.
It is easy to debug as every step has got its own logical sequence.
DISADVANTAGES OF ALGORITHM
II. FLOWCHART
1. Clearly understand the problem statements so that a proper flowchart can be evolved.
2. Study the outputs to be generated so that the input can be specified.
3. Design the process, which will produce the desired result after taking the input.
4. Refine the process.
5. Test the flowchart by giving the test data and see if the desired output is generated. If
not, make appropriate changes in the process and repeat the process.
FLOWCHART SYMBOLS
For easy visual recognition, standard conventions are used for drawing flowcharts. The
flowchart symbols given below follow the conventions of the International Organization for
Standardization (ISO).
S. NO.   DESCRIPTION   SYMBOLS
1. FLOW DIRECTION: The direction of processing or data flow
Advantages of flowcharts
Disadvantages of flowcharts
1. It is time consuming.
2. Representation of complex logic is very difficult.
3. Alterations and modifications can be made only by redrawing the flowchart.
Example 1.4
[Flowchart. Recoverable labels: Start; Input ntype; If ntype = T (branch F); RNA; Output
Deoxy Ribonucleic Acid; Output Wrong Nucleotide type; Stop]
1.2.1.5 CHECKING AND CORRECTING ERRORS - DEBUGGING
Debugging is the process of isolating and correcting the errors in a program. It is a very
important and time-consuming phase of software development. The process of debugging ensures
that the program does what the programmer intends it to do. This stage is also referred to as
verification.
1. The program is first compiled. During compilation the compiler detects certain types of
errors called syntax errors.
2. All the detected errors are corrected and the program is recompiled. The process is
repeated till no errors are displayed.
3. The program is then executed. During the execution of a program, certain errors called
semantic errors are detected. Semantic errors normally occur due to the wrong use of
logic.
4. Once all the errors are corrected, the program is recompiled and executed again.
COMPILER: Software that checks the entire program created by the user for a certain type of
error called syntax errors. If the program is error free, the complete program is translated
into its equivalent machine language program.
Source program: The program created by the user in a high level language is referred to as
the source program.
Object program: The machine language program generated by the compiler is referred to as the
object program.
Syntax: The set of rules that must be followed while creating every statement and structure
in a program.
Syntax error: An error that occurs due to the wrong use of syntax is referred to as a syntax
error.
Semantic error: An error that occurs due to the wrong use of logic is referred to as a
semantic error.
Logic error: If a faithful translation of an algorithm or flowchart still causes the program
to produce wrong results, the error lies in the logic itself and is referred to as a logic
error.
Runtime errors: Errors that are detected during the execution of a program are referred to as
runtime errors.
Testing: The process of checking whether the program works correctly according to the
requirements of the user, i.e., whether the program generates the correct results for a given
input of data.
The correctness of the program can be determined by trying a large number of carefully chosen
test data and seeing whether the program generates the correct output in all those cases. If
debugging is referred to as verification, testing is referred to as validation.
Comment statement
System manual
User manual
Comments
Comments are natural language statements put within a program to assist anyone reading the
source program listing in understanding the logic of the program. They do not contain any
program logic, and are ignored by a language processor.
System manual
Problem definition
Software description
List of program names and its description
Detailed system flowchart and program flowchart
Source listing of all the programs
Specification of all input and output media
Specimen of all input forms and printed outputs.
File layout, that is, the detailed layout of input and output records.
Structure and description of all test data, test results, storage dumps, trace
program printouts, etc., used to test and debug the programs.
User manual
Software must have a good user manual to ensure its easy and smooth usage. It is the user,
not the developer or the programmer, who will regularly use the software after it is installed
and commissioned for use. The user manual must contain the following:
Maintenance is the process of updating programs or upgrading them to new versions so that
they meet the present requirements of the user.
NEED
New errors or bugs, which were not detected during development and testing, are
detected during the usage of the program.
The needs of the user have changed over a period of time and the program has to be
modified to meet the present day needs.
The existing program is not functioning to the satisfaction of the user, thus the
program has to be modified to provide the additional information.
The user has purchased a new computer or hardware and the program is not
functioning properly on the new system.
The user has seen the use of new types of software using Graphical User Interface
(GUI) and thus feels that his existing system has to be changed to the new system.
2
C Fundamentals
2.1 HISTORY
Dennis M. Ritchie, a systems engineer at Bell Laboratories (then part of AT&T) in New Jersey,
USA, developed C in the early 1970's.
The root of modern languages is ALGOL, introduced in the early 1960's. ALGOL was the first
computer language to use a block structure. It gave the concept of structured programming
to the computer science community.
In 1967, Martin Richards developed a language called BCPL (Basic Combined Programming
Language), primarily for writing system software. In 1970, Ken Thompson created a language
using many features of BCPL and called it simply B. B was used to create early versions of
the UNIX operating system at Bell Laboratories.
C was evolved from ALGOL, BCPL and B by Dennis Ritchie at Bell Laboratories in 1972. C
uses many concepts from these languages and adds the concept of data types and other
powerful features. Since it was developed along with the UNIX operating system, it is
strongly associated with UNIX, which was coded almost entirely in C.
To ensure that the C language remains standard, in 1983 the American National Standards
Institute (ANSI) appointed a technical committee to define a standard C. The committee
approved a version of C in December 1989, which is now known as ANSI C. It was then
approved by the International Organization for Standardization (ISO) in 1990. This version of
C is also referred to as C89.
C++, a language based on C with a number of improvements and changes, was developed during
the 1980's and standardized in 1998. During the 1990's, Sun Microsystems of USA created a new
language, Java, modeled on C and C++.
2.2 FEATURES OF C
C is a powerful, flexible language that has gained worldwide acceptance in
recent years.
C is concise, yet powerful in its scope.
It is becoming the standard for program development on small machines and
microcomputers.
Modularity (process of dividing problems into sub problems) makes C ideal
for projects involving several programmers.
It is widely available for most brands and types of computers.
It is portable.
C language is well suited for structured programming.
Despite all these features, C is not a difficult language; it is one of the
easiest languages to learn and implement.
The capability of the C language to work at machine level makes it well suited for systems
programming. A systems program is one of a large class of programs developed for the purpose
of simplifying the process of using the system. Some important systems programs include
operating systems, compilers, interpreters, assemblers and editors.
Being a structured language, C is also very useful in developing large application programs.
Some important application programs include Word Processors, Spreadsheets, CAD
applications, animation and games.
ADVANTAGES OF C
It has a wide variety of derived data structures like pointers, arrays, structures and
unions, apart from fundamental data types like integers, floating point numbers and
characters (discussed at length in Chapters 4 and 5).
Programs written in C are found to execute faster than those written in many other
languages.
Provides a rich set of built-in functions.
It is easily expandable to meet the requirements.
It has the ability to deal efficiently with bits, bytes, words, addresses, etc.
2.4 BASIC STRUCTURE OF C PROGRAM:
Documentation section
Link section
Definition section
Global declaration section
main() function section
{
    Declaration part
    Executable part
}
Subprogram section
    Function 1
    Function 2
    -
    -
    Function n
The documentation section consists of a set of comment lines giving the name of the
program, the author and other details. The link section provides instructions to the compiler
to link functions from the system library. The definition section defines all symbolic
constants.
Some variables are used in more than one function. Such variables are called global variables
and are declared in the global declaration section, which is outside all the functions. This
section also declares user-defined functions.
Every C program must have one main() function section. This section consists of two parts,
the declaration part and the executable part. The declaration part declares all the variables
used in the executable part. There is at least one statement in the executable part. These two
parts must be enclosed between the opening and closing braces. Program execution begins at the
opening brace and ends at the closing brace. All statements in the declaration and executable
parts end with a semicolon (;).
The subprogram section consists of all user-defined functions that are called in the main
function. User-defined functions are generally placed immediately after the main function,
although they may appear in any order. All sections except the main function section may be
absent when they are not required.
1. Editing phase
2. Compilation phase
3. Execution phase
In the editing phase the user enters the program. The program created by the user is called
the source program. The source program is created with the help of software called an editor.
Editor: Software used to interactively review and modify text materials and program
instructions.
In the compilation phase the compiler first checks the program for syntax errors. Once all
the errors are corrected, the program is converted into its equivalent machine language code,
also called the object program. The compilation is performed by software called the compiler.
In the execution phase the program is executed to check whether it gives the proper result
or not.
Executing a C program
A C program can be executed on different platforms such as Windows, UNIX and LINUX. Though
the process of executing a source program is the same in all operating systems, the approach
differs slightly and is specific to the operating system. In the next two sections, executing
source code in different environments is explained.
On the Windows platform, Turbo C and Borland C are commonly available C compilers. These
compilers also have built-in editors. After installing the Turbo C compiler, a new file can
be created by choosing the New option from the File menu. After entering the source code, the
file should be saved with the .c extension (filename.c).
A C program must be compiled and linked with the necessary libraries to build an executable
version (.exe) of that program. The program can be compiled using the Alt+C option. If the
compiler displays any errors, they should be corrected and the file saved and compiled again.
Once the source code is error free, the object file (filename.obj) and the executable file
(filename.exe) are created. Press Alt+R to execute or run the program. The results are viewed
by pressing the Alt+F5 key. Once the exe file is created, it can also be run in the DOS
environment by simply typing the name of the exe file at the command prompt.
On UNIX and LINUX platforms, the source program can be created in any of the standard
editors such as vi, gedit or gvim. The source code should be saved with the .c extension
(Ex: welcome.c). The following command is used for compiling the source code.
cc welcome.c or gcc welcome.c
After compiling the source code, the compiler generates a.out if it is error free. If there
are any errors, they should be debugged and the program recompiled. Once the file a.out is
created, it can be run at the command prompt by typing the following command
./a.out
To store the executable file under a name other than the default a.out, the option -o is
used, as in cc -o welcome welcome.c or gcc -o welcome welcome.c
Now the executable file is stored under the name welcome. So, to run the program, instead of
using a.out, it can be executed by using the filename welcome as
./welcome
Note: In Linux, the “-lm” option is used during compilation to link the math library. It is
placed after the source file name, so the compilation command would be
cc pH.c -lm
Numerals ( 0 to 9 )
=!&%<>``‘.#\
2.7 C TOKENS
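In a passage of text, individual words and punctuation marks are called tokens. Similarly, in
the C language the smallest individual units are known as C tokens. C has six types of tokens:
keywords, identifiers, constants, strings, special symbols and operators.
[Figure: C TOKENS, with examples such as the identifiers molecule and seq_length and the
special symbols {} and []]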
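i) KEYWORDS
Every C word is classified as either a keyword or an identifier. All keywords have fixed
meanings and these meanings cannot be changed. Keywords serve as the basic building blocks of
program statements. All keywords must be written in lowercase. The ANSI C keywords are listed
below:
auto      double    int       struct
break     else      long      switch
case      enum      register  typedef
char      extern    return    union
const     float     short     unsigned
continue  for       signed    void
default   goto      sizeof    volatile
do        if        static    while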
ii) IDENTIFIERS
Identifiers refer to the names of variables, functions and arrays. These are user-defined names
and consist of a sequence of letters and digits, with a letter as a first character. Both
uppercase and lowercase letters are permitted, although lowercase letters are commonly used.
The underscore character is also permitted in identifiers.
iii) CONSTANTS
Constants in C refer to fixed values that do not change during the execution of a program. C
constants are illustrated in Figure 2.1.
a. Integer Constants
An integer constant refers to a sequence of digits. There are three types of integers, namely,
Decimal integer
Octal integer
Hexadecimal integer.
Embedded spaces, commas and non-digit characters are not permitted between digits.
For example, the following are not valid integer constants:
12 34 10,000 $298
An octal integer constant consists of any combination of digits from the set 0 through 7, with
a leading 0. Some examples are:
The largest integer that can be stored is machine dependent. It is 32767 on 16-bit machines
and 2,147,483,647 on 32-bit machines. It is also possible to store larger integer constants on
these machines by appending suffixes such as U, L and UL to the constants.
Examples:
b. Real Constants
Integer numbers are inadequate to represent quantities that vary continuously, such as
distances, temperatures, prices and so on. These quantities are represented by numbers
containing fractional parts, like 23.89. Such numbers are called real (or floating point)
constants.
Examples:
A real number may also be expressed in exponential (or scientific) notation. For example, the
value 215.65 may be written as 2.1565e2 in exponential notation; e2 means multiply by 10^2
(that is, by 100). The general form is:
mantissa e exponent
The mantissa is either a real number expressed in decimal notation or an integer. The
exponent is an integer with an optional + or - sign. The letter e separating the mantissa and
exponent can be written in either lowercase or uppercase.
Examples:
Examples:
d. String constants
A string constant is a sequence of characters enclosed in double quotes. The characters may
be letters, numbers, special characters and blank spaces.
Examples:
NOTE:
The character constant 'a' is not equivalent to the string constant "a". Further, a single
character string constant does not have an equivalent integer value, while a character
constant has an integer value.
Fig. 2.1 Constants Hierarchy
e. Backslash Character Constants
C supports some backslash character constants that are used in output functions. These
backslash character constants are listed in Table 2.1 below:
Constant   Meaning
'\a'       Audible alert (bell)
'\b'       Back space
'\f'       Form feed
'\n'       New line
'\r'       Carriage return
'\t'       Horizontal tab
'\v'       Vertical tab
'\''       Single quote
'\"'       Double quote
'\?'       Question mark
'\\'       Backslash
'\0'       Null
iv) VARIABLES
A variable is a data name that may be used to store a data value. Unlike constants, which
remain unchanged during the execution of a program, a variable may take different values at
different times during execution. A variable name can be chosen by the programmer in a
meaningful way so as to reflect its function or nature in the program. Some examples are:
Nucleotide
Amino_acid
Molecule_structure
The ANSI standard recognizes a length of 31 characters. However, the length should normally
not be more than eight characters, since only the first eight characters are treated as
significant by some older compilers.
v) DATA TYPES
The C language is rich in data types. A variety of data types is available to allow the
programmer to select the type appropriate to the needs of the application as well as the
machine.
ANSI C supports three classes of data types:
1. Primary (or fundamental) data types
2. Derived data types
3. User-defined data types
All C compilers support five fundamental data types, namely integer (int), character (char),
floating point (float), double-precision floating point (double) and void. Many of them also
offer extended data types such as long int and long double.
In order to provide some control over the range of numbers and storage space, C has three
classes of integer storage, namely short int, int and long int, in both signed and unsigned
forms. A short int represents fairly small integer values and, on many machines, requires half
the amount of storage that a regular int uses. Unlike signed integers, unsigned integers use
all the bits for the magnitude of the number and are never negative. We declare long and
unsigned integers to increase the range of values. The use of the qualifier signed on integers
is optional because the default declaration assumes a signed number.
[Fig. 2.2 Integer modifiers: short int, int, long int. Floating point types: float, double, long double]
Variables must be declared before they are used in a program. The general form of a
declaration is:
data-type v1, v2, …. vn;
where v1, v2 …. vn are the names of the variables. Variables are separated by commas.
A declaration statement must end with a semicolon.
For example:
int seq_length;
double ratio;
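C allows the programmer to give a new name to an existing data type with the typedef
keyword. It takes the general form:
typedef type identifier;
where type refers to the existing data type and identifier refers to the new name given to the
data type.
Example:
typedef int length;
typedef char base;
Here length symbolizes int and base symbolizes char. They can later be used to declare
variables, exactly as int and char are used.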
Another user-defined data type is the enumerated data type provided by the ANSI standard. It is
defined as follows:
enum identifier {value1, value2, … valuen};
The identifier is an enumerated data type which can be used to declare variables that can have
one of the values enclosed within the braces (known as enumeration constants). After this
definition, we can declare variables to be of this "new" type as below:
enum identifier v1, v2, … vn;
The enumerated variables v1, v2, … vn can only have the values value1, value2, … valuen.
Assignments of the following kind are valid:
v1=value3; v5=value1;
Example:
enum day {Monday, Tuesday, … Sunday};
enum day week_st, week_end;
week_st=Monday;
week_end=Friday;
if(week_st==Tuesday)
week_end=Saturday;
The compiler automatically assigns integer values beginning with 0 to all enumeration constants.
That is, the enumeration constant value1 is assigned 0, value2 is assigned 1, and so on.
However, the automatic assignments can be overridden by assigning values explicitly to the
enumeration constants. For example:
enum day {Monday=1, Tuesday, … Sunday};
Here, the constant Monday is assigned the value of 1. The remaining constants are assigned
values that increase successively by 1.
The definition and declaration of enumerated variables can be combined in one statement.
Example:
enum day {Monday, Tuesday, … Sunday} week_st, week_end;
Values are assigned to variables using the assignment operator = as follows:
variable_name=constant;
Example: Length=100;
It is also possible to assign a value to a variable at the time of variable declaration. It takes the
following form:
data-type variable_name=constant;
Example:
int mol_wt=150;
char base='A';
We can assign values to variables in two ways: compile time assignment and execution time
assignment.
Example:
int total=68;
float weight=12.3;
Example:
int main(void)
{
    char nucleotide1, nucleotide2;
    nucleotide1='a';
    nucleotide2='t';
}
Data input and output statements are among the most important statements, as they transfer
information or data between the computer and the standard input/output devices (e.g.,
keyboard, mouse, monitor, data files, etc.).
In C, the getchar() and putchar() functions are used for character input and output on the
standard I/O devices.
Syntax of input:
char variable = getchar();
The above function reads a single character from the keyboard and assigns it to the character
variable specified.
Syntax of output:
putchar(variable);
The above function displays the single character specified by the argument on the monitor.
#include <stdio.h>
#include <ctype.h>
int main(void)
{
    char x;
    x = getchar();
    putchar(x); // Displays a character on the screen
    putchar('\n'); // New line
    putchar(tolower(x)); // Displays a lower case letter
    return 0;
}
Sample I/O
B
B
b
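A string is stored in an array of characters, which is declared as:
char string_name[size];
Example:
char sequence[50];
The gets() and puts() functions are used to read and display a string on the standard I/O
devices.
Syntax of input:
gets(string_name);
The above function reads a string from the keyboard and assigns it to the string variable. Note
that gets() cannot limit the number of characters read and was removed from the language in
the C11 standard; fgets() is the safe replacement.
Syntax of output:
puts(string_name);
The above function displays the string held in the string variable on the screen.
#include <stdio.h>
int main(void)
{
    char string[40];
    gets(string); /* unsafe: no length check; prefer fgets() in new code */
    puts(string);
    return 0;
}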
Sample I/O
c for biologists
c for biologists
The scanf() and printf() functions are used to read and print formatted data on a standard I/O
device.
The general form of scanf() is:
scanf("control string", &var-1, &var-2, …. &var-n);
The control string contains the conversion characters (each preceded by the % symbol) for each
data item to be read in. The commonly used conversion characters for different data types are
given in Table 2.2 below.
Conversion character    Data type
c    char
s    string
d    decimal integer
o    octal integer
u    unsigned decimal integer
f    float
Each variable name in the variable list must be preceded by the address operator (&). The
exception is the name of a string (character array), which already denotes an address and is
written without &.
Example
scanf("%c %d %f %s", &sex, &age, &salary, name);
If the input typed is M 35 25000 Rajan, then after the execution of the scanf() statement we
get sex='M', age=35, salary=25000.00 and name="Rajan".
The general form of printf() is:
printf("control string", var-1, var-2, …. var-n);
The printf() function prints the data onto any standard output device in the format specified
by the control string, using the values of the variables var-1, var-2, …. var-n.
Example
printf("Sex=%c\nAge=%d\nName=%s", sex, age, name);
Output
Sex=M
Age=35
Name=Rajan
C provides a way to specify the width of the data being displayed by writing a width w
between the % and the conversion character. For example, %7d specifies integer data 7
characters wide. The output is displayed right justified.
It also provides a way to specify the number of decimal places (n) for floating-point data by
using the format %w.nf. For example, the format %12.3f indicates that the floating-point data
has a total width of 12 characters, of which 3 are used for the decimal places, one is used for
the decimal point and the remaining 8 are used for the integer part.
#include<stdio.h>
int main(void)
{
    printf("Welcome to C programming\n");
    printf("Bioinformatics is the combination of\n");
    printf("Biology, Information Technology and Statistics");
    return 0;
}
Sample I/O
Welcome to C programming
Bioinformatics is the combination of
Biology, Information Technology and Statistics
Sample Program 2.4
Write a C program to illustrate the scanf() and printf() function
#include<stdio.h>
int main(void)
{
    int pH;
    char name[20];
    float mol_wt;
    printf("Enter amino acid name,pH value: ");
    scanf("%19s %d",name,&pH); /* name is an array, so no & is needed */
    printf("Enter molecular weight: ");
    scanf("%f",&mol_wt);
    printf("\nAmino acid details\n");
    printf("Name:%s",name);
    printf("\npH value:%d",pH);
    printf("\nMol.wt:%12.4f",mol_wt);
    return 0;
}
Sample I/O
C supports a rich set of operators. An operator is a symbol that tells the computer to perform
certain mathematical or logical manipulations. Operators usually form a part of mathematical
or logical expressions. C operators can be classified into the following categories:
1. Arithmetic operators
2. Relational operators
3. Logical operators
4. Assignment operators
5. Increment and decrement operators
6. Conditional operators
7. Bitwise operators
8. Special operators
An expression is a combination of operands and operators that reduces to a single value. For
example,
10+10
is an expression whose value is 20. The value can be of any type other than void.
C provides all the basic arithmetic operators, which are listed below:
Operator    Meaning
+    Addition or unary plus
-    Subtraction or unary minus
*    Multiplication
/    Division
%    Modulo division
When both the operands in an expression are integers, the expression is called an integer
expression. For example, if a=15 and b=10 we have the following results:
a - b = 5
a + b = 25
a * b = 150
a / b = 1 (fractional part truncated)
a % b = 5 (remainder of division)
An arithmetic operation involving only real operands is called real arithmetic. A real operand
may assume values either in decimal or exponential notation. For example:
Y = -2.0 / 3.0 = -0.666667
When one of the operands is real and the other is an integer, the expression is called a
mixed-mode expression. The integer operand is converted to real and the result is real.
For example:
15/10.0 = 1.5
Relational operators are used for comparisons. C supports six relational operators in all,
which are tabulated below in Table 2.4.
Operator    Meaning
<     Is less than
<=    Is less than or equal to
>     Is greater than
>=    Is greater than or equal to
==    Is equal to
!=    Is not equal to
The value of a relational expression is either one or zero. It is one if the specified relation is
true and zero if the relation is false.
For example:
10<20 is true
But
20<10 is false
C has the following three logical operators:
&& meaning logical AND
|| meaning logical OR
! meaning logical NOT
The logical operators && and || are used when we want to test more than one condition. An
example is:
a > b && x == 10
An expression of this kind, which combines two or more relational expressions, is known as a
logical expression or compound relational expression.
Truth Table
op-1        op-2        op-1 && op-2    op-1 || op-2
Non-zero    Non-zero    1               1
Non-zero    0           0               1
0           Non-zero    0               1
0           0           0               0
Shorthand Assignment Operators
Statement with simple assignment operator    Statement with shorthand operator
a=a+1          a+=1
a=a-1          a-=1
a=a*(n+1)      a*=n+1
a=a%b          a%=b
The increment and decrement operators are ++ and --. The operator ++ adds 1 to the operand,
while -- subtracts 1. Both are unary operators and take the prefix and postfix forms as follows:
++m; or m++;
--m; or m--;
We use the increment and decrement operators extensively in for and while loops. While ++m
and m++ mean the same thing when they form statements independently, they behave
differently when they are used in expressions on the right hand side of an assignment
statement. Consider the following:
m=5;
y=++m;
In this case, the value of y and m would be 6. Suppose we rewrite the above statements as
m=5;
y=m++;
Then the value of y would be 5 and m would be 6. A prefix operator first adds 1 to the
operand and then the result is assigned to the variable on the left. On the other hand, a postfix
operator first assigns the value to the variable on the left and then increments the operand.
2.13.6 Conditional Operator
A ternary operator pair ? : is available in C to construct conditional expressions of the form
exp1 ? exp2 : exp3
The operator ? : works as follows: exp1 is evaluated first. If it is nonzero (true), then the
expression exp2 is evaluated and becomes the value of the expression. If exp1 is false, exp3 is
evaluated and its value becomes the value of the expression. For example, consider the
following:
a=10;
b=15;
x=(a>b) ? a : b;
In this example x will be assigned the value of b. This can also be achieved by using if - else
statements as follows:
if (a> b)
x=a;
else
x=b;
2.13.7 Bitwise Operators
Bitwise operators are used to manipulate data at the bit level. These operators are used for
testing the bits, or shifting them right or left. Bitwise operators may not be applied to float or
double operands. The following table lists the bitwise operators:
Bitwise Operators
Operator    Meaning
&     bitwise AND
|     bitwise OR
^     bitwise exclusive OR
<<    shift left
>>    shift right
~     one's complement
2.13.8 Special Operators
C supports some special operators such as the comma operator, the sizeof operator, pointer
operators (& and *) and member selection operators (. and ->).
The comma operator can be used to link related expressions together. A comma-linked list of
expressions is evaluated from left to right, and the value of the right-most expression is the
value of the combined expression.
For example:
value = (x = 10, y = 5, x + y);
first assigns the value 10 to x, then assigns 5 to y, and finally assigns 15 (i.e. 10 + 5) to
value.
The sizeof operator is a compile time operator and, when used with an operand, it returns the
number of bytes the operand occupies. The operand may be a variable, a constant, an
expression or a data type qualifier.
Examples:
m=sizeof (sum);
n=sizeof (long int);
k=sizeof (123L);
The sizeof operator is normally used to determine the lengths of arrays and structures when
their sizes are not known to the programmer. It is also used to allocate memory space
dynamically to variables during the execution of a program.
/* Program for illustrating the operators
operator.c
*/
#include<stdio.h>
main()
{
int a=1,b=2,c=3,d;
d = c++;
Sample I/O
Expressions
Algebraic expression    C expression
ab-c             a*b-c
(m+n)(x+y)       (m+n)*(x+y)
ab/c             a*b/c
3x^2+2x+1        3*x*x+2*x+1
High priority: * / %
Low priority : + -
The basic evaluation procedure includes 'two' left to right passes. During the first pass, the
high priority operators (if any) are applied as they are encountered. During the second pass,
the low priority operators (if any) are applied as they are encountered.
For example:
x=a-b/3+c*2-1
When a=9, b=12 and c=3, the statement becomes
x=9-12/3+3*2-1
First pass
Step 1: x=9-4+3*2-1
Step 2: x=9-4+6-1
Second pass
Step 3: x=5+6-1
Step 4: x=11-1
Step 5: x=10
However, the order of evaluation can be changed by introducing parentheses into the
expression. Consider the same expression with parentheses as shown below:
9-12/(3+3)*(2-1)
Whenever parentheses are used, the expression within the parentheses assumes highest
priority. If two or more sets of parentheses appear one after another, the expression contained
in the left-most set is evaluated first and the right-most last. Given below are the new
steps:
First pass
Step 1: 9-12/6*(2-1)
Step 2: 9-12/6*1
Second pass
Step 3: 9-2*1
Step 4: 9-2
Third pass
Step 5: 7
Parentheses may also be nested, and in such a case, evaluation of the expression will proceed
outward from the inner-most set of parentheses.
For example:
9-(12/(3+2)*2)-1=4
Here 3+2 gives 5, the integer division 12/5 gives 2, and 2*2 gives 4, so the result is 9-4-1=4.
If parentheses are nested, the evaluation begins with the innermost sub-expression.
The associativity rule is applied when two or more operators of the same precedence
level appear in a sub-expression.
Arithmetic expressions are evaluated from left to right using the rules of precedence.
When parentheses are used, the expressions within the parentheses assume the highest priority.
C permits the mixing of constants and variables of different types in an expression. During
evaluation, C adheres to very strict rules of type conversion. If the operands are of different
types, the lower type is automatically converted to the higher type before the operation
proceeds. The result is of the higher type. For example:
If one of the operands is long int, the other will be converted to long int and the result
will be a long int.
If one operand is unsigned int, the other will be converted to unsigned int and the
result will be unsigned int.
Conversion Hierarchy:
In C, implicit type conversions are made from the lower size type to the higher size type,
as shown below (highest first):
long double
double
float
long int
unsigned int
int
short, char
However, the following changes are introduced during the final assignment:
1. float to int causes truncation of the fractional part.
2. double to float causes rounding of digits.
3. long int to int causes dropping of the excess higher order bits.
There are instances when we want to force a type conversion in a way that is different from
the automatic conversion, for example to carry out the division of two integers in
floating point mode, thus retaining the fractional part of the result. The process of such a local
conversion is known as explicit conversion or casting a value. The general form is:
(type-name) expression
Uses of casts
Example                Action
x=(int)1.3             1.3 is converted to integer by truncation
y=(int)2.1/(int)4.5    Evaluated as 2/4 and the result is 0
z=cos((double)x)       Converts x to double before using it
m=(int)(a+b)           The result of a+b is converted to integer
n=(int)a+b             a is converted to integer and then added to b
p=(double)sum/n        Division is done in floating point mode
Table 2.9 Type casting
Each operator in C has a precedence associated with it. The precedence is used to determine
how an expression involving more than one operator is evaluated. There are distinct levels of
precedence and an operator may belong to one of these levels. The operators of higher
precedence are evaluated first.
The operators of the same precedence are evaluated either from left to right or from right to
left, depending on the level. This is known as the associativity property of an operator.
Table 2.10 gives the precedence of each operator.
Sample program 2.5
Write a C program to find the pH of a solution for any given hydrogen ion
concentration. [ pH = -log10[H+] ]
#include<stdio.h>
#include<math.h>
int main(void)
{
    float H,pH;
    printf("Give H value: ");
    scanf("%f",&H);
    pH = - log10(H);
    printf("\nH = %f pH = %f",H,pH);
    return 0;
}
Sample I/O
Give H value: 0.000002
H = 0.000002 pH = 5.698970
Sample program 2.6
Write a C program to find the Body Mass Index (BMI) of a person
[ BMI = weight in kg / (height in metres)^2 ]
bmi = w / (h*h);
printf("Weight = %.2f",w);
printf("\nHeight = %.2f",h);
printf("\nBody Mass Index = %.5f",bmi);
}
Sample I/O
Weight = 50.00
Height = 1.60
Body Mass Index = 19.53125
Sample program 2.7
Write a C program to find the pH value for a given [OH-] concentration
[ pH = 14.0 – pOH and pOH = -log10[OH-] ]
/* program to calculate pH value
for a given [OH]- concentration
pH_value.c
*/
#include<stdio.h>
#include<math.h>
int main(void)
{
    float OH_con, pOH, pH;
    printf("Give OH_con value: ");
    scanf("%f",&OH_con);
    pOH = -log10(OH_con);
    pH = 14.0 - pOH;
    printf("\nOH concentration = %f pOH = %f",OH_con, pOH);
    printf("\npH = %f",pH);
    return 0;
}
Sample I/O
Give OH_con value: 0.04
Sample program 2.8
Write a C program to find the rpm value
[ rpm = 1000 * √( RCF / (11.17 * r) ), where RCF is the Relative Centrifugal Force and r is the
maximum radius in cm ]
#include<stdio.h>
main()
{
float rcf, rpm, radius;
printf("Give rcf, radius values: ");
scanf("%f %f",&rcf, &radius);
Sample I/O
Sample Program 2.9
Write a C program to compute RCF value
[ RCF = 11.17 * r_max * (rpm/1000)^2, where r_max is given in cm ]
Sample I/O
3
Control Statements in C
In the previous chapter, we studied the fundamental concepts of C. The sample programs we
have seen so far only require some simple formulae to be evaluated to obtain the desired
result. In all these programs, the statements are executed sequentially and no conditional or
control statements are involved. This chapter discusses the control statements required to
solve problems in which conditions are involved.
Statements which alter the sequential order (or flow) of execution of a program based on
certain conditions are called 'control' statements. There are two types of control statements
namely,
i. Branching statements
ii. Looping statements
(i) Branching statements: The statements which are used to execute a group of instructions
or statements upon satisfying some conditions are called branching statements. C has two
types of branching statements.
Conditional branching statements:
(i) if – else
(ii) if – else if
(iii) Nested - if
(iv) switch – case
Unconditional branching statement:
goto
Apart from this unconditional branching statement, some of the other branching statements or
functions available in C are given below:
break
exit()
continue
(ii) Looping statements: The statements which are used to execute a group of instructions or
statements repeatedly until some specific condition is satisfied are called looping statements.
The three looping statements available in C are given below
while
do – while
for
The simplest control statement is the if – else statement. This statement is used to alter the
flow of execution of the program based on a test, typically a comparison of two quantities.
if (expression)
    statement-1;
else
    statement-2;
The expression must be enclosed within parentheses. Statement-1 is executed if the
expression is true (nonzero); otherwise statement-2 is executed. If the statements are
compound statements, they must be enclosed within curly braces {}. The else part is optional
and need not always be present.
The if-else control statement results in a two-way branching. This can be described by the
flow chart shown in Fig. 3.1.
Examples
1. if ( n % 2 == 0)
       printf("\n Given number %d is even ",n);
   else
       printf("\n Given number %d is odd ",n);
Sample Program 3.1
Write a C program to find the bonus value and gift for the workers.
#include <stdio.h>
#include <ctype.h>
main()
{
char gender;
float bonus;
int salary;
Sample I/O
Bonus = 450.00
Gift as saree
Sample Program 3.2
/*
To find the number of purines and pyrimidines
pcount.c
*/
#include <stdio.h>
int main(void)
{
    int a,t,g,c;
    int purines,pyrimidines;
    /* input step restored; the prompt wording is assumed */
    printf("Enter the counts of a, t, g and c: ");
    scanf("%d %d %d %d",&a,&t,&g,&c);
    if( a - t == 0)
    {
        purines = a + g;
        pyrimidines = t + c;
        printf("\n No. of purines = %d ", purines);
        printf("\n No. of pyrimidines = %d", pyrimidines);
    }
    else
    {
        printf("\n a and t should have same value");
    }
    return 0;
}
Sample I / O
3.2.2 Conditional branching statements (ii) if- else if statement
The if – else if statement chains ifs together when multipath decisions are involved.
A multipath decision is a chain of ifs in which the statement associated with each else is an if.
if ( condition 1)
    statement – 1;
else if ( condition 2)
    statement – 2;
else if ( condition 3)
    statement – 3;
else
    default – statement;
statement – x;
If the statements are compound statements, they must be enclosed within curly braces {}.
This construct is known as the else if ladder. The conditions are evaluated from the top
downwards. As soon as a true condition is found, the statement associated with it is executed
and control is transferred to statement-x (skipping the rest of the ladder). When all the
n conditions are false, the final else containing the default statement will be
executed. This can be described by the flow chart shown in Fig. 3.2.
[Fig. 3.2 Flow chart of the else if ladder: Condition-1 to Condition-n are tested in turn; a true (T) branch executes the matching statement, the final false (F) branch executes the default statement, and control then passes to Statement-x]
Sample Program 3.3
/*
Find the nature of the given solution
phnature.c
*/
#include <stdio.h>
int main(void)
{
    float pH;
    printf("Give pH value : ");
    scanf("%f",&pH);
    if( pH == 7.0)
        printf("\nGiven solution is Neutral");
    else if (pH > 0.0 && pH < 7.0)
        printf("\n Given solution is Acidic");
    else if (pH > 7.0 && pH <= 14.0)
        printf("\n Given solution is Basic ");
    else
        printf("\n Invalid pH value");
    return 0;
}
Sample I/O
Result 1:
Give pH value : 5.5
Given solution is Acidic
Result 2:
Give pH value : 7
Given solution is Neutral
Sample program 3.4
Write a C program to identify the glucose level in blood.
The glucose level is classified as: < 70 hypoglycemia, 70 – 180 hyperglycemia,
> 180 diabetics
#include<stdio.h>
main()
{
int glucose;
Sample I/O
Diabetics
Sample program 3.5
Write a C program to find the anemic level
The anemic level is identified by the hemoglobin value: < 8 severe; 8 – 9.5 moderate;
9.6 – 13 mild; > 13 and <= 17 normal.
/*
To find the anemic level
anemic.c
*/
#include <stdio.h>
int main(void)
{
    float hl;
    /* input step restored; the prompt wording is assumed */
    printf("Enter hemoglobin level : ");
    scanf("%f",&hl);
    if ( hl < 8.0 )
        printf("\n Severe Anemic");
    else if ( hl <= 9.5 )
        printf("\n Moderate");
    else if ( hl <= 13.0 )
        printf("\n Mild Anemic");
    else if ( hl <= 17.0 )
        printf("\n Normal");
    else
        printf("Invalid hemoglobin level");
    return 0;
}
Sample I/O
Severe Anemic
3.2.3. Conditional branching statements (iii) Nested if statement
In C, control statements can be nested. In a C program segment, one if-else statement can be
completely embedded in another, and so on.
if ( condition1 )
    if ( condition2 )
        Statement – A
    else
        Statement – B
else
    if ( condition3 )
        Statement – C
    else
        Statement – D
If condition1 is true, condition2 is checked. If both are true, Statement-A is executed; if
condition1 is true but condition2 is false, Statement-B is executed. If condition1 is false,
control moves to condition3. If it is true, Statement-C is executed; otherwise Statement-D is
executed. This can be described by the flow chart shown in Fig. 3.3.
[Fig. 3.3 Flow chart of the nested if statement: cond.1 true leads to cond.2 (T: Statement A, F: Statement B); cond.1 false leads to cond.3 (T: Statement C, F: Statement D); all paths continue to Statement x]
Sample program 3.6
Write a C program to compare two peptides and determine the type of each peptide, i.e. small
or poly, based on the number of amino acids entered for each peptide.
The amino acid range for a small peptide is < 1 – 8 >, whereas it is < 9 – 50 > for a poly peptide.
/*
Compare two peptides
acid_type.c
*/
#include<stdio.h>
int main(void)
{
    int peptide1,peptide2;
    /* input step restored; the prompt wording is assumed */
    printf("Enter the number of amino acids in the two peptides: ");
    scanf("%d %d",&peptide1,&peptide2);
    if ( peptide1 <=8 )
        if ( peptide2 <= 8 )
            printf("\n Entered amino acid are small peptides");
        else
            printf("\n Entered amino acid are both small and poly peptides");
    else
        if ( peptide2 <= 8 )
            printf("\n Entered amino acid are both poly and small peptides");
        else
            printf("\n Entered amino acid are poly peptides");
    return 0;
}
Sample I/O
Sample program 3.7
Write a C program to determine which among three values has the maximum value. This can
be used while aligning amino acid sequences to find the maximal score value among
diagonal, left and above cell scores.
/*
To determine the maximum score among diagonal, left and
above
max_score.c
*/
#include <stdio.h>
main()
{
int diagonal, left, above;
Sample I/O
3.2.4 Conditional branching statements (iv) the switch-case statement
The if control statement works well when decisions are to be made from a few alternatives.
However, if there are many alternatives to select from, the if-else structure becomes tedious
and confusing. In such cases the switch statement is used. It works as a multi-way
decision-making tool that compares the value of an expression against a list of case values.
switch ( expression )
{
case value-1 :
    block1;
    break;
case value-2 :
    block2;
    break;
……
……
default :
    default block;
}
Statement – x;
The expression specified in the switch is compared with case value-1, case value-2 and so on.
If a match is found, that particular block is executed. The break statement at the end of each
block signals the end of the particular case and causes the program to exit from the switch
statement. Control is then transferred to the statement-x following the switch. The default
statement is optional. If it is not present and all case matches fail, no action takes place and
control is transferred to statement-x. Figure 3.4 describes the flow chart of the switch-case
statement, where exp represents the expression, v1, v2, v3 … vn represent value-1, value-2 …
value-n respectively and b1, b2, b3 …. bn represent block1, block2, block3 … blockn
respectively.
[Fig. 3.4 Flow chart of the switch-case statement]
The break statement is used to exit from the switch-case, thus preventing more than one block
from being executed. If the break statement is not used, execution continues from the matching
case into all the case blocks that follow it. The break statement is also used to terminate
loops.
The general form of the break statement is,
break;
The exit() function is used to terminate the program execution. It is declared in <stdlib.h>.
By convention, a program that ends normally returns the value zero. For any other type of
ending, such as one due to errors, a nonzero integer value can be passed within the
parentheses of the exit() function. The general format of the exit() function is,
exit(status);
The exit() function is commonly used with the default block in a switch-case statement, so as
to terminate the execution of the program, if wrong values are detected for the case values.
Difference between if statement and switch statement
Sample Program 3.8
Write a C program to determine the type of DNA depending on the base number.
#include <stdio.h>
int main(void)
{
    int base_no;
    printf("Enter number of bases in DNA : ");
    scanf("%d",&base_no);
    switch(base_no)
    {
    case 10 :
        printf("\nType of DNA is B");
        break;
    case 11:
        printf("\nType of DNA is A");
        break;
    case 12:
        printf("\nType of DNA is Z");
        break;
    default :
        printf("\n No valid DNA type for this base number");
        printf("\n Enter base <12 or 10 or 11> and try again");
    }
    return 0;
}
Sample I/O
Result 1:
Enter number of bases in DNA : 11
Type of DNA is A
Result 2:
Enter number of bases in DNA : 1
No valid DNA type for this base number
Enter base <12 or 10 or 11> and try again
Sample program 3.9
#include<stdio.h>
#include<ctype.h>
int main(void)
{
    char x;
    x = getchar(); /* input step restored */
    switch(tolower(x))
    {
    case 'a' :
    case 'e' :
    case 'i' :
    case 'o' :
    case 'u' :
        printf(" \n %c is a vowel",x);
        break;
    default :
        printf(" \n %c is a consonant",x);
    }
    return 0;
}
Sample I/O
b is a consonant
Sample program 3.10
Write a C program to create amino acid dictionary
#include<stdio.h>
int main(void)
{
char aa;
printf("Enter the starting letter of the amino acid : ");
scanf("%c",&aa);
printf("Amino acids starting with %c are \n",aa);
switch(aa)
{
case 'a':
case 'A': printf("alanine\n");
printf("arginine\n");
printf("asparagine\n");
printf("aspartic acid\n");
break;
case 'c':
case 'C': printf("cysteine\n");
break;
case 'g':
case 'G': printf("glutamine\n");
printf("glutamic acid\n");
printf("glycine\n");
break;
case 'h':
case 'H': printf("histidine\n");
break;
case 'i':
case 'I': printf("isoleucine\n");
break;
case 'l':
case 'L': printf("leucine\n");
printf("lysine\n");
break;
case 'm':
case 'M': printf("methionine\n");
break;
case 'p':
case 'P': printf("phenylalanine\n");
printf("proline\n");
break;
case 's':
case 'S': printf("serine\n");
break;
case 't':
case 'T': printf("threonine\n");
printf("tryptophan\n");
printf("tyrosine\n");
break;
case 'v':
case 'V': printf("valine\n");
break;
default:
printf("wrong input\n");
}
}
Sample I/O
3.3 LOOPING STATEMENTS
The conditional statements execute a statement or block of statements once, based on a
condition. Sometimes a statement or block of statements has to be executed more than
once. Looping statements are used for this purpose. The while, do – while and for statements
are examples of looping statements.
Looping statement
A looping statement consists of a test condition and a statement or block to be repeated. The
test condition decides whether the statement or block is to be repeated: it is repeated as long
as the condition is true.
Sentinel loops
Loops are classified as counter-controlled loops and sentinel-controlled loops. If the number
of repetitions is known in advance, the loop is referred to as a counter-controlled loop or
definite repetition loop. The for loop is an example of a definite loop.
If the number of executions of the loop is not fixed, it is referred to as a sentinel-controlled
loop. The repetition is controlled by a special value called the sentinel value. For example,
when reading data we can indicate the end of data by a special value such as -1 or any
negative number. The control variable is called the sentinel variable. The while and
do – while statements are examples of this indefinite repetition loop.
The simplest looping structure is the while statement. This construct is also called a
pre-tested looping statement. Its general form is:
while ( test-condition )
{
body of the loop
}
The while loop is an entry-controlled loop. The test-condition is evaluated first. If the
condition is true, the body of the loop is executed. After executing the body of the loop, the
test condition is evaluated once again. If it is true, the body of the loop is executed again. This
process continues until the condition becomes false.
[Flow chart of the while loop: loop entry, then the test condition; T executes the statement block and updates the control variable, F exits the loop]
Examples
(i) Printing the numbers 1 to 15
i = 1; // initialization
while ( i <= 15 ) // testing
{
    printf("%d,",i);
    i++; // incrementing
}
(ii) Another example which uses the keyboard input
character = ' ';
The while statement executes the body of the loop only if the condition is true, so the body
may not be executed even once. Sometimes the body of the loop has to be executed before
the condition is tested. The do-while statement is used for handling these situations.
This statement is referred to as a post-tested looping statement. The minimum number of
executions of the body of a do-while statement is 1.
do
{
body of the loop
} while ( test-condition );
The body of the loop is executed first, without checking any condition. After that the condition
is evaluated. If the condition is true, the body of the loop is executed again; otherwise the loop
terminates.
The flow diagram of the do-while loop is given in Fig. 3.5.
[Fig. 3.5 Flow diagram of the do-while loop: loop entry, statement block with the counter variable, then the test condition; T repeats the block, F exits the loop]
choice = 'y';
do
{
    printf("Welcome to Bioinformatics \n");
    printf("Do you want to print again :");
    scanf(" %c",&choice); /* the leading space skips the newline left by the previous input */
} while ( choice == 'y');
The above structure prints Welcome to Bioinformatics again and again as long as the choice
entered is y.
3.3.3. The for statement
The for statement executes a statement or block of statements a certain number of times. Its
general form is:
for ( expression-1; expression-2; expression-3 )
{
Loop statements;
}
where expression-1 is used to initialize some parameter (the counter variable), expression-2
represents the test condition and expression-3 is used to alter the value of the parameter
initially assigned by expression-1.
The flow diagram of the for loop is given in Fig 3.6.
[Fig. 3.6 Flow diagram of the for loop, where i indicates the counter variable]
The for loop can also be used to write an infinite loop. The format is as follows
for ( ; ; )
{
………..
………..
}
The break statement is used to terminate the infinite loop.
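#include <stdio.h>
int main(void)
{
    int n,r;
    /* input step restored; the prompt wording is assumed */
    printf("Enter a number : ");
    scanf("%d",&n);
    while ( n > 0 )
    {
        r = n % 10;
        printf("%d",r);
        n = n / 10;
    }
    return 0;
}
Sample I/O
Sample program 3.12
#include <stdio.h>
int main(void)
{
    int n,bn,r,sum=0;
    /* input step restored; the prompt wording is assumed */
    printf("Enter a number : ");
    scanf("%d",&n);
    bn = n; /* keep the original number for the final message */
    while ( n > 0 )
    {
        r = n % 10;
        sum += r;
        n = n / 10;
    }
    printf("\n Sum of digits of %d = %d",bn,sum);
    return 0;
}
Sample I/O
Sample program 3.13
#include <stdio.h>
int main(void)
{
    int n,bn,r,sum=0;
    /* input step restored; the prompt wording is assumed */
    printf("Enter a number : ");
    scanf("%d",&n);
    bn = n; /* keep the original number for the comparison */
    while ( n > 0 )
    {
        r = n % 10;
        sum += r * r * r;
        n = n / 10;
    }
    if ( bn == sum)
        printf("\n %d is an Armstrong number",bn);
    else
        printf("\n %d is not an Armstrong number",bn);
    return 0;
}
Sample I/O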
Sample program 3.14
Write a C program to count the number of base characters entered through the keyboard using
the do-while statement
#include <stdio.h>
main()
{
char base,choice;
int count=0,n=0;
do
{
printf("Enter the character : ");
scanf(" %c",&base);
switch(base)
{
case 'a' :
case 'g' :
case 'c' :
case 't' :
case 'u' : count++;
}
n++; /*To count the total number of character*/
Sample I/O
Enter the character : c
Do U want to enter another character (y/n) : y
Enter the character : t
Do U want to enter another character (y/n) : n
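#include <stdio.h>
int main(void)
{
    int n,f,i;
    /* input step restored; the prompt wording is assumed */
    printf("Enter a number : ");
    scanf("%d",&n);
    for(i=1,f=1;i<=n;i++)
    { f *= i; }
    printf("\n %d! = %d",n,f);
    return 0;
}
Sample I/O
6! = 720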
Sample program 3.16
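#include<stdio.h>
int main(void)
{
    int n,i,a,b,c;
    /* input step restored; the prompt wording is assumed */
    printf("Enter the number of terms : ");
    scanf("%d",&n);
    a = -1;
    b = 1;
    for(i=1; i<=n; i++)
    {
        c = a + b;
        printf("%d\t",c);
        a = b;
        b = c;
    }
    return 0;
}
Sample I/O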
Sample program 3.17
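#include <stdio.h>
int main(void)
{
    int n,flag=1,i;
    printf("Enter the number to check for prime : ");
    scanf("%d",&n);
    /* test loop restored as an assumption: look for a divisor between 2 and n/2 */
    if (n < 2)
        flag = 0;
    for(i=2; i<=n/2; i++)
        if (n % i == 0) { flag = 0; break; }
    if (flag)
        printf("\n %d is a prime number",n);
    else
        printf("\n %d is not a prime number",n);
    return 0;
}
Sample I/O
Result 1:
Enter the number to check for prime : 13
13 is a prime number
Result 2:
Enter the number to check for prime : 25
25 is not a prime number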
The continue statement
The continue statement is used to skip part of the statement block in a loop. Whereas the
break statement terminates the loop, the continue statement causes the loop to continue with
the next iteration, skipping the statements that follow it in the loop body.
continue;
Example
for ( i=1;i<=10;i++)
{
    if ( i ==7)
        continue;
    printf("%d\t",i);
}
This loop prints the numbers 1 to 10 except 7.
#include<stdio.h>
#include<math.h>
main()
{
int mol_no,counter=1;
float residual_value, residual_sqrt;
while(counter < 3)
{
printf("\nEnter a molecule number : ");
scanf("%d",&mol_no);
if(mol_no < 0)
{
printf("Molecule Number should always be positive\n");
break;
}
printf("Enter residual value for the given molecule number :");
scanf("%f",&residual_value);
if(residual_value < 0)
{
printf("Enter positive number for finding sqrt \n");
continue;
}
residual_sqrt = sqrt(residual_value);
printf("SQRT of residual value: %.3f <for molecule number %d> is %.3f\n", residual_value, mol_no, residual_sqrt);
counter++;
}
}
Sample I/O
4
Arrays
An array is a sequenced collection of related data items that share a common name. Each
storage location in an array is called an array element. Individual array elements are
accessed by their index (subscript) value. There are two kinds of arrays.
1. Static array: This kind of array contains a fixed number of elements; its size is fixed
when the program is compiled.
2. Dynamic array: This kind of array contains a number of elements that is decided at run
time through dynamic memory management. The array can be resized as required.
Example
int genes[20];
char nucleosides[10];
Arrays are used not only to represent simple lists but also to represent tables of data in
two, three or more dimensions. Depending on the number of dimensions, arrays are classified as
one dimensional, two dimensional and multidimensional arrays.
A one dimensional array has only one subscript or index. The index is used to identify an array
element. The index value starts from 0.
Example
int marks[5];
4.1.2 Memory Representation
A one dimensional array occupies a set of contiguous memory locations, and each
location is accessed by an offset from the first location.
Fig. 4.1 shows the memory representation of a one dimensional array A whose last element
is A[9].
The size required to store an array in memory is the product of the size of the array's data
type and the number of elements in the array. For an integer array of 10 elements, assuming
that an int occupies 2 bytes,
Size = 2 * 10 = 20 bytes.
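The figure of 2 bytes per integer holds only for some compilers; on many current systems an int occupies 4 bytes. The following is a minimal sketch of how the sizeof operator reports the actual size on the machine at hand. The function names array_bytes and array_elements are illustrative and do not come from the text.

```c
#include <stddef.h>

/* Bytes occupied by an array of 10 ints on the machine running the code. */
size_t array_bytes(void)
{
    int A[10];
    return sizeof A;                 /* same as 10 * sizeof(int) */
}

/* Number of elements, derived from the sizes instead of being hard-coded. */
size_t array_elements(void)
{
    int A[10];
    return sizeof A / sizeof A[0];
}
```

Calling array_bytes() from main gives 20 where an int is 2 bytes and 40 where it is 4 bytes, while array_elements() gives 10 in both cases.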
After an array is declared, its elements must be initialized. An array can be initialized at
either compile time or run time.
In compile time initialization, the array values are assigned in the declaration
itself.
Example
If the number of initial values is less than the array size, the value 0 is assigned to the
remaining positions. For example,
float hemoglobin[5] = { 7.6, 12.5, 11.2 };
will initialize the first three elements to 7.6, 12.5, 11.2 and the remaining two elements to 0.0.
The size of the array may be omitted. In such a case the compiler allocates space for all the
initialized array elements.
An array can be explicitly initialized at run time. This approach is usually applied for
initializing large arrays. For example, the following C segment, where n is the number of
elements in A,
for ( i = 0; i < n; i++ )
{ A[i] = i*5; }
assigns the value i*5 to each element A[i].
An input function such as scanf can also be used to initialize an array. For example, the statements
int molecules[3];
for ( i = 0; i < 3; i++ )
{ scanf ("%d",&molecules[i]); }
will initialize the array elements with the values entered through the keyboard.
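Both forms of run-time initialization can be wrapped in small functions. This is only a sketch: the names fill_multiples_of_five and read_ints are hypothetical, and checking the return value of scanf is an addition that the text does not discuss.

```c
#include <stdio.h>

/* Assign a[i] = i*5 to the first n elements of a. */
void fill_multiples_of_five(int a[], int n)
{
    int i;
    for (i = 0; i < n; i++)
        a[i] = i * 5;
}

/* Read up to n integers from the keyboard into a.
   Returns how many values were actually read. */
int read_ints(int a[], int n)
{
    int i;
    for (i = 0; i < n; i++)
        if (scanf("%d", &a[i]) != 1)   /* stop at end of input or a bad value */
            break;
    return i;
}
```

For an array declared as int molecules[3], the call read_ints(molecules, 3) fills it from the keyboard and reports how many elements were filled.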
Sample program 4.1
Write a C program to declare an array containing the ORF lengths of various genes and
initialize it. (An Open Reading Frame is the part of an organism's genome containing a
sequence of bases that could potentially encode a protein.)
Sample program 4.2
Write a C program to count the number of positive, negative and zero energy molecules stored
in an array
#include <stdio.h>
main()
{
pe=ne=ze=ec=0;
printf("\n\n Number of energy molecules are %d",ec);
printf("\n Number of positive energy molecules are %d",pe);
printf("\n Number of negative energy molecules are %d",ne);
printf("\n Number of zero energy molecules are %d",ze);
Sample I/O
/*
Find the average base count
base_avg.c
*/
#include <stdio.h>
main()
{
char base[4] = { 'a', 't', 'g', 'c' };
char bmax,bmin;
int base_count[4],count;
int max,min,avg,btotal=0;
max = base_count[0];
min = base_count[0];
btotal += base_count[count];
}
Sample I/O
Sample program 4.4
/* sort n characters
csort.c
*/
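#include <stdio.h>
int main(void)
{
    char x[10],t;
    int n,i,j;
    printf("Enter the number of characters to sort : ");
    scanf("%d",&n);
    if (n > 10)
    {
        printf("Maximum size of the array is 10. Try again");
        return 0;
    }
    printf("\nEnter %d characters\n",n);
    for(i=0; i<n; i++)
        scanf(" %c",&x[i]);
    /* bubble sort: after each pass the largest remaining character is in place */
    for(i=0;i<n-1;i++)
        for(j=0;j<n-(i+1);j++)
            if( x[j] > x[j+1])
            {
                t = x[j];
                x[j] = x[j+1];
                x[j+1] = t;
            }
    printf("\nSorted characters\n");   /* output wording assumed */
    for(i=0; i<n; i++)
        printf("%c\n",x[i]);
    return 0;
}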
Sample I/O
Result1:
Enter the number of characters to sort : 12
Maximum size of the array is 10. Try again
Result2:
Enter the number of characters to sort : 5
Enter 5 characters
i
w
c
k
a
The simplest form of multidimensional array is the two dimensional array. It is generally used
to represent tables of values. This array uses two indices: the first index selects the row and
the second index selects the column. In the declaration, the first size gives the number of
rows and the second size gives the number of columns.
Example
int marks[4][3];
4.2.2 Memory Representation
Fig. 4.2 shows the memory representation of an array of 3 rows and 3 columns. The rows
(Row0, Row1, Row2) are stored one after another.
The size of a two dimensional array is calculated by using the formula,
Size = row_size * column_size * sizeof(datatype)
For example, the size of an integer array with 3 rows and 4 columns, int a[3][4], assuming
that an int occupies 2 bytes, is
Size = 3 * 4 * 2 = 24 bytes
Just like a one dimensional array, a two dimensional array can be initialized at either
compile time or run time.
In compile time initialization, the array values are assigned in the declaration
itself.
Example
To represent the marks of two students in three subjects, a two dimensional array with 2 rows
and 3 columns is required. The declaration is int Marks[2][3];
The values can be initialized as
int Marks[2][3] = {40,60,56,73,79,80}; (or)
int Marks[2][3] = { {40,60,56} , {73,79,80} };
will assign,
Marks[0][0] = 40; Marks[0][1] = 60; Marks[0][2] = 56;
Marks[1][0] = 73; Marks[1][1] = 79; Marks[1][2] = 80;
If the number of data elements is less than the size, the value 0 is assigned to the
remaining locations.
An array can also be initialized at run time, which is the usual approach for large arrays.
An input function such as scanf can be used for this. For example, the statements
int Marks[2][3], student, subject;
for ( student = 0; student <2; student++)
for ( subject=0; subject<3; subject++)
{
scanf("%d",&Marks[student][subject] );
}
initialize the array elements with the values entered through the keyboard.
Sample Program 4.5
Write a C program to accept an m * n matrix from the keyboard and display its transpose
#include<stdio.h>
main()
{
int matrix[5][5],m,n,row,col;
Sample I/O
Result1:
Enter the order of the matrix m*n : 6*2
Max. matrix size is 5*5
Try Again
Result2:
Enter the order of the matrix m*n : 3*2
Given matrix is
1 3
2 8
0 5
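The transpose step of the program can be sketched as a function that copies element [row][col] of the given matrix to element [col][row] of the result. The macro MAX_DIM and the function name transpose are assumptions made for this sketch; the limit of 5 follows the "Max. matrix size is 5*5" message in the sample I/O.

```c
#define MAX_DIM 5   /* largest row or column count, as in the sample program */

/* Store the transpose of the m-by-n matrix src in dst, which becomes n-by-m. */
void transpose(int src[MAX_DIM][MAX_DIM], int dst[MAX_DIM][MAX_DIM], int m, int n)
{
    int row, col;
    for (row = 0; row < m; row++)
        for (col = 0; col < n; col++)
            dst[col][row] = src[row][col];
}
```

For the 3*2 matrix shown above, the result is a 2*3 matrix whose first row is 1 2 0 and whose second row is 3 8 5.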
datatype arrayname[size1][size2][size3]...[sizen];
Example
int m[4][3][6][5];
In a multidimensional array, the computer needs time to compute the memory offset from each
index. This means that accessing an element in a multidimensional array can be slower than
accessing an element in a single-dimensional array.
5
Functions
This chapter discusses functions, an important feature of the C language that is
very useful for structured programming.
Debugging is easier
It is easier to understand the logic involved in the program
Testing is easier
Recursive call is possible
Irrelevant details in the user point of view are hidden in functions
Functions are helpful in generalizing the program
The functions which are predefined by C are called built-in or library functions. These
functions are grouped into different header files. Some of the header files and the
library functions they declare are given below.
1. math.h - contains all mathematical functions
2. stdio.h - contains all standard i/o functions such as scanf, printf etc.
3. ctype.h - contains all character functions
4. string.h - contains all string functions
5.3.1.1 MATHEMATICAL FUNCTIONS
All the mathematical functions are declared in the math.h header file. So before using any of
the mathematical functions in a program, one should include the line:
#include <math.h>
The following Table 5.1 lists some of the standard mathematical functions
Function      Meaning
cos(x)        Cosine of x
sin(x)        Sine of x
tan(x)        Tangent of x
fabs(x)       Absolute value of x; fabs(-50.0) = 50.0 (the integer version, abs, is declared in stdlib.h)
fmod(x,y)     Remainder of x/y; fmod(10.0,3.0) = 1.0
exp(x)        e to the power x (e^x)
log(x)        Natural log of x, x>0
log10(x)      Base 10 log of x, x>0
pow(x,y)      x to the power y (x^y)
sqrt(x)       Square root of x, x>=0
ceil(x)       x rounded up to the nearest integer; ceil(12.2) = 13.0
floor(x)      x rounded down to the nearest integer
Table 5.1 Mathematical functions
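#include<stdio.h>
#include<math.h>
int main(void)
{
    float OH_con,pOH,pH;
    printf("Give OH_con value: ");
    scanf("%f",&OH_con);
    pOH = -log10(OH_con);   /* pOH = -log10 of the OH concentration */
    pH = 14 - pOH;          /* pH + pOH = 14 */
    printf("\n OH concentration = %f\n pOH = %f \n pH = %f",OH_con, pOH, pH);
    return 0;
}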
Sample I/O
OH concentration = 0.030000
pOH = 1.522879
pH = 12.477121
The character functions are useful for testing and transforming characters. All the
character functions are declared in the ctype.h header file. This header file must be included
before using any of the character functions.
Function      Meaning
isalpha(x)    Returns true if x is a letter
isupper(x)    Returns true if x is an uppercase letter
islower(x)    Returns true if x is a lowercase letter
isdigit(x)    Returns true if x is a digit [0-9]
isalnum(x)    Returns true if x is an alphanumeric character
isspace(x)    Returns true if x is a space, tab, carriage return, newline, form feed or vertical tab character
tolower(x)    Converts an uppercase letter to lowercase
toupper(x)    Converts a lowercase letter to uppercase
Sample program 5.2
Write a C program to accept a single character from keyboard and identify whether it is a
letter or digit. If it is a letter change into opposite case
#include<stdio.h>
#include<ctype.h>
int main(void)
{
    char x;
    printf("Enter any character: ");
    scanf("%c",&x);
    if(isalpha(x))
    {
        if(islower(x))
            printf("Uppercase of %c is %c",x,toupper(x));
        else
            printf("Lowercase of %c is %c",x,tolower(x));
    }
    else if(isdigit(x))
        printf("%c is a digit",x);
    else
        printf("%c is a symbol",x);
    return 0;
}
Sample I/O
Result1:
Enter any character: g
Uppercase of g is G
Result2:
Enter any character: 8
8 is a digit
Result3:
Enter any character: &
& is a symbol
5.3.2 User defined functions
The functions which are developed by the user are called user defined functions. These
functions are created according to the needs of the user. User defined functions can be
compiled separately.
a. Function Definition
A function definition, also known as function implementation shall include the following
elements:
1. Function name
2. Function type
3. List of parameters
4. Local variable declarations
5. Function statements and
6. A return statement
All six elements are grouped into two parts, namely the function header and the function body.
Function Header
The function header consists of three parts: the function type (also known as return type), the
function name and the formal parameter list. Semicolon is not used at the end of the function
header.
The function type specifies the type of value (like float or double) that the function is
expected to return to the program calling the function. If the return type is not explicitly
specified, older (C89) compilers assume that it is int; C99 and later require the return type
to be written. If the function does not return anything, the return type is specified as void,
which is one of the fundamental data types in C. The value returned is the output produced by
the function.
The function name is any valid C identifier and therefore must follow the same rules of
formation as other variable names in C.
The parameter list declares the variables that will receive the data sent by the calling program.
They serve as input data to the function to carry out the specified task. Because they stand in
for the actual input values, they are referred to as formal parameters. These parameters can
also be used to send values back to the calling program. The parameters are also known as
arguments.
The parameter list contains declarations of variables separated by commas and surrounded by
parentheses. There is no semicolon after the parentheses.
Examples:
Declarations of parameter variables cannot be combined. That is, void swap(int x,y) is
illegal. A function need not always receive values from the calling program. In this situation
the keyword void is used to indicate that the formal parameter list is empty.
Example
void printline(void)
{
…
…
}
This function neither receives any input values nor returns back any value. Many compilers
accept an empty set of parentheses, like void printline(). But it is good programming style to
use void to indicate a null parameter list.
Function Body
The function body contains the declarations and statements necessary for performing the
required task. The body, enclosed in braces, contains three parts in the following order:
1. Local declarations that specify the variables needed by the function
2. Function statements that perform the task of the function
3. A return statement that returns the value evaluated by the function
The return statement is not required if the function does not return any value.
The function can return only one value in one function call. The general form of the return
statement is
return;
(or)
return (expr);
Example
return; // does not return any value
return(a); // returns the value of the identifier a
return( a*b ); // returns the value of the expression a*b
b. Function calls
A function is called by writing the function name followed by the list of actual parameters.
Example
main()
{
int y;
y = sum(15,-10);
printf("Sum = %d",y);
}
When the function call is executed, control is transferred to the function
sum(). This function is then executed line by line and a value is returned when a
return statement is encountered. This value is assigned to y. This is illustrated below.
main()
{
int y;
y = sum(15,-10);      /* function call */
printf("Sum = %d",y);
}
int sum(int x, int y)   /* called function */
{
return x+y;
}
The function call sends two integer values 15 and -10, which are assigned to x and y
respectively. The function computes the value and returns the value 5 to the main where it is
assigned to y.
If the function has been declared with a prototype, the compiler reports an error when the
number of actual parameters does not match the number of formal parameters. Without a
prototype the behaviour is undefined: typically any extra actual arguments are discarded,
unmatched formal parameters hold garbage values, and any mismatch in data types may also
result in garbage values.
c. Function Declaration
Like variables, all functions in a C program must be declared, before they are invoked. A
function declaration (also known as function prototype) consists of four parts.
Function type
Function name
Parameter list
Terminating semicolon
For example, sum function defined in the previous section will be declared as:
int sum( int x, int y); /* function prototype */
5.4 CATEGORY OF FUNCTIONS
A function, depending on whether arguments are present or not and whether a value is
returned or not, may belong to one of the following categories:
Example
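#include<stdio.h>
int main(void)
{
    void drawLine(char,int);
    drawLine('*',15);
    drawLine('#',10);
    drawLine('-',20);
    drawLine('@',5);
    return 0;
}
/* Print the character x n times, followed by a newline */
void drawLine(char x, int n)
{
    int i;
    for(i=0;i<n;i++)
        printf("%c",x);
    printf("\n");
    return;
}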
Sample I/O
***************
##########
--------------------
@@@@@
Sample program 5.4
Write a user defined function to search an element in an array
#include <stdio.h>
int main(void)
{
    int a[20],n,i,se,p;
    int Linear_Search(int[],int,int);
    printf("Enter the number of elements : ");
    scanf("%d",&n);
    if (n > 20)
    {
        printf("Max. size of an array is 20\nTry Again");
        return 0;
    }
    printf("Enter %d elements\n",n);
    for(i=0;i<n;i++)
        scanf("%d",&a[i]);
    printf("Enter the searching element: ");
    scanf("%d",&se);
    p = Linear_Search(a,n,se);
    if(p>0)
        printf("\nElement found at position %d",p);
    else
        printf("\nElement not found");
    return 0;
}
Sample I/O
Result1:
Enter the number of elements : 25
Max. size of an array is 20
Try Again
Result2:
Enter the number of elements : 5
Enter 5 elements
4
90
-5
23
0
Enter the searching element: 15
Result3:
Enter the number of elements : 6
Enter 6 elements
50
-70
45
92
175
36
Enter the searching element: -70
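The listing above calls Linear_Search but its definition is not shown. A minimal sketch consistent with the prototype and with the test p>0 in main is given here; returning a position counted from 1, and 0 for "not found", is an assumption inferred from that test.

```c
/* Return the position (counted from 1) of key in a[0..n-1],
   or 0 if key does not occur in the array. */
int Linear_Search(int a[], int n, int key)
{
    int i;
    for (i = 0; i < n; i++)
        if (a[i] == key)
            return i + 1;
    return 0;
}
```

With the six elements of Result3, searching for -70 returns 2; with the five elements of Result2, searching for 15 returns 0.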
5.5 RECURSION
Sample program 5.5
Write a C program to find nCr value using recursion
#include<stdio.h>
main()
{
int n,r,nf,rf,nrf,ncr;
int fact(int);
printf("Enter the values of n and r : ");   /* prompt wording assumed */
scanf("%d%d",&n,&r);
if (n>=r)
{
nf = fact(n);
rf = fact(r);
nrf = fact(n-r);
ncr = nf / (rf * nrf);
printf("\n%dC%d = %d",n,r,ncr);
}
else
{
printf("\nThe r value must not be greater than n");
printf("\nTry Again");
}
}
int fact(int n)
{
if( n ==0 || n==1)
return 1;
else
return n * fact(n-1);
}
Sample I/O
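The factorial approach overflows quickly: 13! already exceeds the range of a 32-bit int, so the program above is reliable only for small n. As an alternative, not taken from the text, nCr can be built up as a running product. The function name ncr is illustrative.

```c
/* nCr as a running product. After step i the value of result is the
   binomial coefficient C(n-r+i, i), so each division is exact. */
long ncr(int n, int r)
{
    long result = 1;
    int i;
    if (r < 0 || r > n)
        return 0;
    if (r > n - r)
        r = n - r;                  /* C(n,r) equals C(n,n-r); use the smaller r */
    for (i = 1; i <= r; i++)
        result = result * (n - r + i) / i;
    return result;
}
```

This version keeps the intermediate values close to the final answer, so ncr(20,10) can be computed even though 20! does not fit in a 32-bit integer.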
5.6 THE SCOPE, VISIBILITY AND LIFETIME OF VARIABLES
Every variable in C has a storage class. The storage class describes the scope,
visibility and lifetime of the variable. The scope of a variable determines over what region of
the program the variable is actually available for use (active). The visibility refers to the
accessibility of a variable from the memory. The lifetime of a variable refers to the period
during which the variable retains a given value during execution of the program (alive).
C provides the following storage classes:
Automatic variables
External variables
Static variables
Register variables
Variables may also be broadly categorized as internal (local) or external (global)
depending on the place of their declaration. Variables declared inside a
function are called internal variables and those declared outside any function are called
external variables.
Automatic variables are the variables which are declared inside the function. They are created
when the function is called and destroyed automatically when the function is exited, hence
the name automatic. Automatic variables are therefore local to the functions and also called
as local or internal variables.
A variable declared inside a function without any storage class specification is, by default, an
automatic variable. The keyword auto may also be used to declare an automatic variable explicitly.
Example
main()
{
auto int x;
-------
-------
}
An important feature of automatic variables is that their values cannot be changed by any
other function in the same program. This ensures that the same variable name can be used in
different functions of the same program without causing confusion to the compiler.
Sample program 5.6
Write a C program to illustrate the automatic storage class of a variable
#include<stdio.h>
main()
{
auto int chromosomes = 36;
void change_chromosome();
printf("Number of chromosomes\n");
printf("\nBefore function call: %d",chromosomes);
change_chromosome();
printf("\nAfter function call: %d",chromosomes);
}
void change_chromosome()
{
int chromosomes = 50;
printf("\nInside the function: %d",chromosomes);
}
Sample I/O
Number of chromosomes
Before function call: 36
Inside the function: 50
After function call: 36
Variables that are both alive and active throughout the entire program are known as external
variables. They are also known as global variables. Global variables can be accessed by any
function in the program. The variables which are declared outside the function are referred to
as external variables by default.
main()
{
extern int y; // external variable – explicit declaration
-------------
-------------
}
/* To illustrate the external storage class
extern.c
*/
#include<stdio.h>
char blood_gp; /* global variable */
main()
{
void change_blood_gp();
blood_gp='A';
printf("Blood Group\n");
printf("\nBefore function call: %c",blood_gp);
change_blood_gp();
printf("\nAfter function call: %c",blood_gp);
}
void change_blood_gp()
{
blood_gp='B';
printf("\nInside the function: %c",blood_gp);
return;
}
Sample I/O
Blood Group
Before function call: A
Inside the function: B
After function call: B
5.6.3. STATIC VARIABLES
The value of static variables persists until the end of the program. A variable can be declared
static using the keyword static like
static int x;
A static variable may be either an internal type or an external type depending on the place of
declaration.
Internal static variables are those which are declared inside a function. The scope of an
internal static variable extends up to the end of the function in which it is defined.
Internal static variables are therefore similar to auto variables, except that they remain alive
throughout the remainder of the program. As a result, internal static variables can be used to
retain values between function calls. For example, one can be used to count the number of calls
made to a function.
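The call-counting use mentioned above can be sketched as follows. The function name count_calls is illustrative and does not appear in the text.

```c
/* Returns how many times this function has been called so far. */
int count_calls(void)
{
    static int calls = 0;   /* initialized once; keeps its value between calls */
    calls++;
    return calls;
}
```

The first call returns 1, the second 2, and so on. If the keyword static were removed, calls would be created afresh on every call and the function would always return 1.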
#include<stdio.h>
main()
{
int i;
void degradation();
void degradation()
{
int static chains=30;
printf("%d\n",chains);
chains=chains-2;
Sample I/O
A register variable asks the compiler to keep the variable in one of the machine's registers
instead of keeping it in memory. Since a register access is much faster than a memory access,
keeping frequently used variables in registers leads to faster execution of programs. This is
requested as follows:
register int x;
Although the ANSI standard does not restrict its application to any particular data type, most
compilers allow only int or char variables to be placed in registers. Only a few variables
can be placed in registers. However, C will automatically convert register variables into
non-register variables once the limit is reached.
6
Strings
A string is a sequence of characters that is treated as a single data item. A string is terminated
by a null character. The null character serves as the “end-of-string” marker.
C does not support string as a data type. Instead, a string is represented and declared as a
character array. The general form of string declaration is as follows:
char string_name[size];
The size determines the number of characters assigned to the string variable.
Example
char sequence[20];
char acid_name[10];
When the compiler assigns a character string to a character array, it automatically supplies a
null character ('\0') at the end of the string. Therefore, the size should be equal to the
maximum number of characters in the string plus one.
The size of the array is declared as 8, because the length of alanine is 7 and one more
element is needed for the null character.
The size is optional when the array is initialized. If the array is initialized character by
character, the null character must also be supplied explicitly.
Example
char acid_name[] = { 'g','l','u','t','a','m','i','n','e','\0' };
C does not allow the initialization to be separated from the declaration. That is,
char seq[5];
seq = "agtc"; // is not allowed.
Similarly, C does not allow one string to be assigned to another.
Example
char food[8]="protein";
char pulses[8];
pulses = food; // is not allowed
The standard input function scanf with the %s control string is used to read strings
from the terminal.
Example
char carbo_hydrate[15];
scanf("%s",carbo_hydrate);
The scanf function terminates its input at the first white space it finds. White space
includes blanks, tabs, carriage returns, form feeds, and new lines. Therefore if the following
line is typed in at the terminal,
brown rice
then only the string "brown" will be read into the array carbo_hydrate. While reading strings,
the ampersand (&) symbol is not required in the scanf function. The array name itself points
to the memory location.
To avoid this problem the gets function can be used to read strings along with the white space.
This function is declared in the stdio.h header file. The general form of the gets function is
gets(string);
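The gets function does not check the size of the array it fills, and it was removed from the language in the C11 standard. A safer way to read a whole line, including white space, is the standard fgets function. The sketch below wraps it in a helper; the name read_line is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Read one line from fp into buf, which can hold size characters
   including the terminating '\0'. The newline, if present, is removed.
   Returns 1 on success and 0 on end of file or error. */
int read_line(char buf[], int size, FILE *fp)
{
    if (fgets(buf, size, fp) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';   /* cut the string at the newline */
    return 1;
}
```

With char carbo_hydrate[15], the call read_line(carbo_hydrate, 15, stdin) reads "brown rice" in full and never stores more than 14 characters plus the null character.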
The printf and puts functions are used to print strings to the screen. The general forms
of these functions are as follows:
printf("%s",string_name);
puts(string_name);
Example
printf("%s",carbo_hydrate);
puts(carbo_hydrate);
6.5 READING A LINE OF TEXT
C supports a format specification known as edit set conversion code %[..] that can be used to
read a line containing a variety of characters, including whitespaces.
Example
char line[80];
scanf("%[^\n]",line);
printf("%s",line);
will read a line of input from the keyboard and display the same on the screen.
Table 6.1 lists the most commonly used string functions. The string
functions are declared in the string.h header file.
Function                          Meaning
strlen(string)                    Returns the number of characters in the string before '\0'
strcpy(string1,string2)           Copies string2 into string1
strncpy(string1,string2,size)     Copies at most size characters of string2 into string1
strcat(string1,string2)           Appends string2 to the end of string1
strncat(string1,string2,size)     Appends at most size characters of string2 to the end of string1
strchr(string,char)               Returns a pointer to the first instance of char in string. Returns a NULL pointer if char is not in the string
strcmp(string1,string2)           Compares string1 and string2. Returns 0 if they are the same, a number < 0 if string1 < string2, a number > 0 if string1 > string2
strncmp(string1,string2,size)     Compares up to size characters of string1 and string2. Returns values in the same way as strcmp
strstr(string1,string2)           Returns a pointer to the first instance of string2 in string1. Returns a NULL pointer if string2 is not in string1
strrev(string)                    Reverses the string (not part of standard C; provided by some compilers only)
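Since strrev is missing from many compilers, a program that needs it can carry its own version. The following is a minimal sketch built only on the standard strlen; the name reverse_string is hypothetical.

```c
#include <string.h>

/* Reverse s in place and return it. */
char *reverse_string(char s[])
{
    size_t i, j, len = strlen(s);
    char t;
    if (len < 2)
        return s;                       /* nothing to swap */
    for (i = 0, j = len - 1; i < j; i++, j--) {
        t = s[i];                       /* swap the two ends and move inwards */
        s[i] = s[j];
        s[j] = t;
    }
    return s;
}
```

The argument must be a modifiable array, for example char seq[] = "agtc"; passing a string constant directly is not allowed because string constants cannot be changed.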
Sample Program 6.1
Write a C program to compute base percentage of the sequence
#include<stdio.h>
main()
{
int i,a,g,c,t,gap;
float ac,gc,cc,tc;
char seq[50];
a=g=c=t=gap=0;
for (i=0;seq[i]!='\0';i++)
{
if (seq[i]=='a')
a++;
else if (seq[i]=='c')
c++;
else if (seq[i]=='t')
t++;
else if (seq[i]=='g')
g++;
}
printf("\nPERCENTAGE OF \n");
printf("a = %3.2f\nc = %3.2f\ng = %3.2f\nt = %3.2f",
ac,cc,gc,tc );
Sample I/O
Base a=3
Base c=4
Base t =3
Base g=3
PERCENTAGE OF
A = 23.08
C = 30.77
G = 23.08
T = 23.08
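The counting and percentage steps of this program can be sketched as two small functions. The names count_bases and percentage, and the choice to accept both upper and lower case, are assumptions made for this sketch rather than the book's own code.

```c
#include <ctype.h>

/* Count the bases in seq: count[0]=a, count[1]=c, count[2]=g, count[3]=t.
   Upper and lower case are treated alike; other characters are skipped.
   Returns the total number of bases counted. */
int count_bases(const char seq[], int count[4])
{
    int i, total = 0;
    count[0] = count[1] = count[2] = count[3] = 0;
    for (i = 0; seq[i] != '\0'; i++) {
        switch (tolower((unsigned char)seq[i])) {
        case 'a': count[0]++; total++; break;
        case 'c': count[1]++; total++; break;
        case 'g': count[2]++; total++; break;
        case 't': count[3]++; total++; break;
        default:  break;                /* gaps and other symbols are ignored */
        }
    }
    return total;
}

/* Percentage of part in total; 0.0 when total is 0 to avoid dividing by zero. */
double percentage(int part, int total)
{
    return total > 0 ? 100.0 * part / total : 0.0;
}
```

For a sequence with 3 a, 4 c, 3 g and 3 t, as in the sample I/O, the total is 13 and percentage(4, 13) gives 30.77 when printed with two decimal places.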
Sample Program 6.2
Write a C program to count the number of gaps in the given sequence
#include<stdio.h>
main()
{
char seq[25];
int gaps;
int Gap_Count(char[]);
for(i=0;seq[i] != '\0';i++)
{
if (seq[i] == '-' )
count++;
}
return count;
}
Sample I/O
Sample Program 6.3
Write a C program to convert the DNA sequence to RNA sequence
#include<stdio.h>
#include<ctype.h>
main()
{
int i;
char seq[20];
for(i=0;seq[i]!='\0';i++)
{
seq[i]= toupper(seq[i]);
if(seq[i]=='A')
seq[i]='u';
if(seq[i]=='T')
seq[i]='a';
if(seq[i]=='G')
seq[i]='c';
if(seq[i]=='C')
seq[i]='g';
}
Sample I/O
RNA sequence is
ucgaugcaucga
Sample Program 6.4
Write a C program to find the reverse complement of the given sequence
#include<stdio.h>
#include<string.h>
#include<ctype.h>
main()
{
void reverse (char[]);
char seq[60], new[60];
int i,j,l;
for(i=0;seq[i]!='\0';i++)
{
if(islower(seq[i]))
seq[i]=toupper(seq[i]);
}
reverse(seq);
printf("\n\n");
}
void reverse(char seq[])
{
char new[60];
int i,j,l;
for(i=0;seq[i]!='\0';i++)
{
switch(seq[i])
{
case 'A':
seq[i]= 'T';break;
case 'T':
seq[i]= 'A';break;
case 'G':
seq[i]= 'C';break;
case 'C':
seq[i]= 'G';break;
default:
seq[i]= '-';
}
}
l= strlen(seq);
for(i=l-1,j=0;i>=0;i--,j++)   /* stop at the first character; seq[-1] must not be read */
{
new[j]= seq[i];
}
new[j]= '\0';
printf("\tTRANSLATED SEQ.: ");
puts(new);
printf("\n\t'-' represents the gaps in the Query sequence.");
}
Sample I/O
>>> R E V E R S E C O M P L E M E N T O R <<<
Sample Program 6.5
Write a C program to perform case conversion
#include<stdio.h>
#include<ctype.h>
main()
{
char string[20];
void lowercase(char[]);
void uppercase(char[]);
int choice;
printf("\tCASE CONVERSION");
printf("\n1.Lowercase");
printf("\n2.Uppercase");
printf("\n3.Exit\n");
do
{
printf("\nEnter the choice: ");
scanf("%d",&choice);
switch(choice)
{
case 1 : printf("Enter the string in
uppercase\n");
scanf("%s",string);
lowercase(string);
break;
default : return;
}
void lowercase(char s[])
{
int i;
for(i=0;s[i]!='\0';i++)
s[i]=tolower(s[i]);
printf("Lower case string is\n");
puts(s);
}
Sample I/O
CASE CONVERSION
1.Lowercase
2.Uppercase
3.Exit
Sample program 6.6
Write a C program to check whether the given sequence is a palindrome sequence or not
#include<stdio.h>
#include<string.h>
main()
{
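char seq[40];
int palindrome(char[]);
int p;
printf("Enter the sequence\n");
scanf("%39s",seq);      /* at most 39 characters fit in seq */
p=palindrome(seq);
if(p)
printf("\n%s\nis a palindrome sequence",seq);
else
printf("\n%s\nis not a palindrome sequence",seq);
}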
Sample I/O
Result1
Enter the sequence
TAGCAACGAT
TAGCAACGAT
is a palindrome sequence
Result2
Enter the sequence
TCTAGACTGA
TCTAGACTGA
is not a palindrome sequence
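The palindrome function called by the program is not shown in the listing. A minimal sketch that agrees with the prototype and with both sample results compares the sequence with itself from the two ends. It tests for a plain text palindrome, as the sample I/O does; a palindrome in the molecular biology sense, where a sequence equals its own reverse complement, would need a different test.

```c
#include <string.h>

/* Return 1 if seq reads the same forwards and backwards, otherwise 0. */
int palindrome(char seq[])
{
    int i, j;
    for (i = 0, j = (int)strlen(seq) - 1; i < j; i++, j--)
        if (seq[i] != seq[j])
            return 0;       /* first mismatch settles the answer */
    return 1;
}
```

For TAGCAACGAT every pair of characters taken from the two ends matches, so the function returns 1; for TCTAGACTGA the first pair (T and A) already differs and it returns 0.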
The total molecular weight (MW) of single stranded DNA molecules, such as synthetic
oligonucleotides, can be determined by adding the molecular weight of individual nucleotides
( a, t, g and c) using the formula
Write a C program to find the molecular weight of a single stranded DNA molecule after
checking whether it is phosphorylated or dephosphorylated.
#include<stdio.h>
#include<ctype.h>
#include<string.h>
main()
{
char dna[SIZE],ptype;
int i,len,na,nt,ng,nc;
float a_wt=335.2,t_wt=326.2,g_wt=351.2,c_wt=311.2;
float pval,dna_ml_wt;
printf("Phosphorylated [p] or dephosphorylated [d]: ");
scanf(" %c",&ptype);   /* the leading space skips a newline left by earlier input */
len=strlen(dna);
na=nt=ng=nc=0;
for(i=0;i<len;i++)
{
switch(tolower(dna[i]))
{
case 'a' : na++;
break;
case 't' : nt++;
break;
case 'g' : ng++;
break;
case 'c' : nc++;
break;
default :
printf("\n Wrong base value");
printf("\nExiting");
return;
}
}
printf("\np-type = %c",ptype);
printf("\nThe given DNA sequence : %s ",dna);
printf("\nLength of DNA sequence : %d ",len);
printf("\nphospho value = %.3f",pval);
printf("\n\nDNA mol.wt. : %10.2f Daltons (or g/mol) ",dna_ml_wt);
Sample I/O
P-type = P
The given DNA sequence : atgcatgc
Length of DNA sequence : 8
Phospho value = 40.000
#include<stdio.h>
#include<string.h>
main ()
{
char *rs,str[100],mtf[100];
int count=0;
rs = strstr(str,mtf);
while(rs!=0)
{
count++;
rs = strstr(rs+1,mtf);
}
printf("Motif %s is found %d time ",mtf,count);
Sample I/O
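The counting loop of the motif program can be packaged as a function. This is a sketch under the same approach as the listing, repeated calls to strstr that restart one character after each hit, so overlapping occurrences are counted. The name count_motif and the guard against an empty motif are additions.

```c
#include <string.h>

/* Count how many times motif occurs in seq, overlapping matches included. */
int count_motif(const char *seq, const char *motif)
{
    int count = 0;
    const char *rs;
    if (*motif == '\0')
        return 0;                       /* an empty motif would match everywhere */
    rs = strstr(seq, motif);
    while (rs != NULL) {
        count++;
        rs = strstr(rs + 1, motif);     /* continue just after the start of the last hit */
    }
    return count;
}
```

For the sequence atgatgatg and the motif atg the function returns 3. Because the search restarts one character after each hit, the motif aa is found 3 times in aaaa.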
#include<stdio.h>
#include<string.h>
main()
{
int i,l1,l2;
char a, seq1[100],seq2[100],aseq[100];
l1 = strlen(seq1);
l2 = strlen(seq2);
if( l1>l2)
for(i=l2;i<l1;i++)
aseq[i]='-';
else
for(i=l1;i<l2;i++)
aseq[i]='-';
aseq[i]='\0';
printf("\nSequences are\n");
printf("%s\n%s",seq1,seq2);
printf("\nAligned sequence is \n%s",aseq);
Sample I/O
Result1
Enter Two sequences
CTCCTGACTTTCCTCGCTTGGTGGTTTGAGTGGACCTCCCAGGCCAGTGCCGGGCCCCTCATAGG
AGAGG
AAGCTCGGGAGGTGGCCAGGCGGCAGGAAGGCGCACCCCCCCAGCAATCCGCGCGCCGGGACAGA
ATGCC
Sequences are
CTCCTGACTTTCCTCGCTTGGTGGTTTGAGTGGACCTCCCAGGCCAGTGCCGGGCCCCTCATAGG
AGAGG
AAGCTCGGGAGGTGGCCAGGCGGCAGGAAGGCGCACCCCCCCAGCAATCCGCGCGCCGGGACAGA
ATGCC
Aligned sequence is
---CT-----------C--G--G-----AG--G--C-CCC----CA-T-C--G--CC---A-AG-
A----
Result2
Enter Two sequences
CCTGGAGGGTGGCCCCAC
CTGCAGGAACTTCTTCTGGAAGACCTTC
Sequences are
CCTGGAGGGTGGCCCCAC
CTGCAGGAACTTCTTCTGGAAGACCTTC
Aligned sequence is
C-----G-----C--C------------
Sample Program 6.10
Write a C program to sort 'n' names
/* sort n names
sort_names.c
*/
#include <stdio.h>
#include <string.h>
main()
{
char names[20][30],t[30];
int n,i,j;
printf("Enter the number of names : ");   /* prompt wording assumed */
scanf("%d",&n);
printf("\nEnter %d names\n",n);
for(i=0; i<n; i++)
scanf("%s",names[i]);
for(i=0;i<n-1;i++)
for(j=0;j<n-(i+1);j++)
if( strcmp(names[j],names[j+1]) > 0)
{
strcpy(t,names[j]);
strcpy(names[j],names[j+1]);
strcpy(names[j+1],t);
}
Sample I/O
Enter 5 names
protein
dna
alanine
nucleotide
rna
#include<stdio.h>
#include<string.h>
#define M_ROW 20
#define M_COL 20
void print_seq_matrix(int,int);
void build_seq_matrix();
int verify_seq(char[]);
int seq[M_ROW][M_COL];
main()
{
build_seq_matrix();
}
void build_seq_matrix()
{
int i,j,rs;
int row,col;
char seq1[15],seq2[15];
rs=0;
printf("\nEnter the second DNA sequence\n");
scanf("%s",seq2);
rs = verify_seq(seq2);
if (!rs)
{
printf("\nInvalid sequence");
return;
}
j=strlen(seq2);
if(j>i)
{
col=j+1;
row=i+1;
}
else
{
col=i+1;
row=j+1;
}
// Fills up the rest of the matrix with 1
print_seq_matrix(row,col);
}
Sample I/O
Result1
Enter the first DNA sequence
agctcga
Result2
Enter the first DNA sequence
agctbcg
Invalid sequence
7
Pointers
The computer's memory is a sequential collection of storage cells. Each cell, commonly
known as a byte, has a number called its address associated with it. Typically, the addresses
are numbered consecutively, starting from zero. The last address depends on the memory size. A
computer system having 64K memory will have 65,535 as its last address.
Consider the declaration
int seq_length = 120;
This statement instructs the system to find a location for the integer variable seq_length and
put the value 120 in that location. Suppose the system chooses the address location 4800 for
seq_length. This can be represented as in Fig. 7.1:
seq_length    Variable
4800          Address
120           Value
A pointer is a special type of variable which points to the location of another variable. In
general, variables hold the values that are stored in them. In contrast, pointer variables hold
the addresses of the memory locations of other variables. Pointers can point to variables of
any basic data type, such as int, char, float and double, and also to derived data types like
arrays, structures and unions (the last two will be discussed later in the book).
7.2 ADVANTAGES OF POINTERS
For example
p= &seq_length; /*assigns the address 4800 */
For example
p = &seq_length;
Then *p yields the value 120, that is, the value stored at the address held in p.
Because pointer variables hold addresses, which belong to a separate data type, they must be declared
as pointers before they are used. The declaration of a pointer variable is given below:
data_type *ptr_name;
Example
int *a;
float *pH;
A pointer variable can be declared for any data type. It is important that a pointer variable
of integer type holds the memory location address of an integer variable only. Similarly,
a pointer variable of float type holds the memory location address of a float variable
only. The data type of a pointer variable must match the type of the variables whose
memory addresses are to be stored in it.
Fig. 7.2 shows that each ordinary variable has a memory
location address at which its actual value is stored. For example, the ordinary variable x has the
value 100 at memory location 4800, and the ordinary variable y has the
value 200 at memory location 4802. The address of x can be obtained by using the &
operator and stored in the pointer variable px by a simple assignment
statement. Similarly, the address of y is obtained by using the & operator and stored in
the pointer variable py by an assignment statement. In C this can be done with the following
segment:
int x=100,y=200;
int *px, *py;
px = &x;
py = &y;
Sample Program 7.1
Write a C program to illustrate the concept of pointer
/* Illustrate the concept of pointer
p_molwt.*/
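#include<stdio.h>
int main(void)
{
float mol_wt, *ptr_mwt;
mol_wt = 40.56f;
ptr_mwt = &mol_wt; /* ptr_mwt now holds the address of mol_wt */
printf("Value of mol. wt using ordinary variable = %.3f\n", mol_wt);
printf("Value of mol.wt using pointer variable = %.3f\n", *ptr_mwt);
return 0;
}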
Sample I/O
Value of mol. wt using ordinary variable = 40.560
Value of mol.wt using pointer variable = 40.560
7.5 POINTERS AND ARRAYS
There is a close relationship between pointers and arrays. Array operation which can be
achieved by array subscripting can also be done with pointers. Pointer arithmetic is faster
than array indexing. If an array aacid[] is declared then aacid is the address of the first
element or &aacid[0] represents the address of its first element.
char aacid[50], *paacid;
In the above statement, a sequence array aacid of length 50 characters and a char pointer
paacid are declared. The pointer can be initialized to the address of the first element of the array
aacid[] in the following manner.
paacid = aacid; /* same as paacid = &aacid[0]; */
Because the pointer is a variable, it is easy to point it to any element in the array and modify
the contents it points to.
The pointer paacid in the above statement points to the first element in the array. To access the
next element in the array, the pointer should be incremented so that it points to the next
element. The two most common operations on pointers are increment and
decrement. When a pointer is incremented or decremented, its value increases or
decreases by the size of the data type it points to, so that it points to the next or the previous
element of the array.
The increment operator advances the pointer by one element, which means that the pointer
points to the next storage location of that type. In the case of an array, it points to the next array element.
paacid++ increases the value of the pointer by the size of a char (1 byte), as each element in
the array is of char type. If paacid were an integer pointer, paacid++ would instead increase its
value by the size of an int, that is, by sizeof(int) bytes.
paacid-- decreases the value of the pointer by the size of a char (1 byte), as each element
in the array is of char type.
Write a C program to illustrate the pointer arithmetic operation
7.7 POINTERS AND FUNCTIONS
Syntax
Example
7.8 FUNCTION CALL BY VALUE AND CALL BY REFERENCE
In C, there are two ways in which arguments can be passed to a function.
The first is call by value. This method copies the value of an argument into the formal
parameter of the function. In this case, changes made to the parameter have no effect on the
argument.
Call by reference is the second way of passing arguments to a function. In this method, the
address of an argument is copied into the parameter. Inside the function, the address is used
to access the actual argument used in the call. This means that changes made through the parameter
affect the argument. In C, call by reference is achieved by passing a pointer to an argument
instead of passing the argument itself.
Sample Program 7.5
Write a C program to demonstrate the function call by reference
8
Structures & Unions
An array is a collection of similar data items. C also provides a user-defined data type called a
structure, which can group data items of similar or dissimilar types. The variables in a structure
are called the members of the structure. A structure may therefore contain members of type int, float,
char or double. A structure can also be nested, and it can contain an array of another
structure. The main advantage of a structure is that the related data of a
particular entity can be handled as a single record.
The structure is declared by the keyword struct. The general format of the structure is as
given below.
Example
struct organism
{
int temperature;
char salt[10];
float pH;
};
The structure variables are declared as
struct organism org1, org2, org3;
A structure can also be created using the keyword typedef. The general format of a structure
created with the keyword typedef is as follows:
If the structure is created using typedef, the keyword struct is not needed when declaring
variables of that type. The variables can be created as follows:
structure_name variable-1, variable-2, variable-3;
Example
typedef struct {
int temperature;
char salt[10];
float pH;
} organism;
The advantage of creating a structure with the keyword typedef is that it avoids typing the
keyword struct repeatedly when declaring variables, passing the structure to a function,
allocating memory and so on.
In general, structures are declared globally, outside the main() function, so that they can
be accessed by all other functions as well.
The members of a structure can also be initialized in the same way as an array.
Example
The initialization for the above structure created using typedef is as follows:
organism org = {100,"low",4.3};
8.3 ACCESSING A STRUCTURE
A structure member can be accessed by using the member operator (.). The general syntax
for accessing a structure member is as follows:
structure_variable.member_name;
For example, org.temperature, org.salt and org.pH denote the three members of the organism
structure variable org.
8.4 ARRAY OF STRUCTURES
Arrays of structures are declared in the same way as ordinary arrays.
For example
struct organism org[20];
declares org as an array of 20 elements of the organism structure type, so the data of
20 organisms can be processed.
Sample Program 8.2
Write a C program to check the stability of n DNA sequences
8.5 NESTED STRUCTURES
A structure can contain other structures that have their own members.
Example
typedef struct
{
int concentration;
float absorbtion;
}absorbance;
typedef struct
{
absorbance r1;
absorbance r2;
absorbance r3;
absorbance r4;
}DNA_abs_260;
Sample program 8.3
Write a C program to check the DNA purification. Purification is identified as follows
8.6 UNIONS
A union is similar to a structure in that it can group members of different data types. The main
difference is that a union can hold a value for only one of its
members at a given time, whereas a structure can hold values for all of its members. A
structure allocates memory for each of its members separately, but a union allocates only
one memory space, and all the members share the same memory location. The size of a
union is determined by the size of its largest member.
Sample program 8.4
Write a C program to illustrate the use of union
9
File Processing in C
The formatted input and output functions scanf() and printf() are used to read and write
data through the console, that is, the keyboard and the monitor. These functions are fine if the amount
of data is small. But many real-life problems involve large amounts of data, such as reading 100
sequences or sorting 100 student names. In such situations console I/O operations have two
major problems: they are time consuming, and the input and output data are not preserved.
Once the program terminates, the entire data is lost. So it is necessary to store the data and read it
whenever it is needed. This can be achieved by using files. A file is a place on the disk where
a group of related data is stored. By using files the data can be stored permanently. C
provides a set of functions for processing the data in a file. By using these functions, data
can easily be read from and written to a file.
The function fopen() has two string arguments. The first is the name of the file to open and the
second specifies the purpose (file mode) of opening the file. The most important modes
for file processing are "w" to open the file for writing, "r" to open the file for reading and "a"
to open the file for appending data.
Example
FILE *fin, *fout;
fin = fopen("dna.seq","r");
fout = fopen("rna.seq","w");
The first fopen() call opens the file dna.seq for reading using fin and the second
opens the file rna.seq for writing using fout.
Example
fclose (fin);
fclose (fout);
9.3.1 Character Input and Output
The C function fgetc() is used to read a single character at a time from a file and the function
fputc() is used to write a single character into the file.
Syntax
Example
fputc (ch,fout);
ch = fgetc(fin);
The first statement writes a character to the file using the fout pointer and the second statement
reads a single character from the file using the fin pointer and stores it in the variable ch.
Example
char line[25] = "agctagctagctagcgagctct";
fputs(line,fout);
fgets(line, 25, fin);
The fputs() call writes a string into the file using the fout file pointer and the fgets() call
reads a string of at most 24 characters from the file using the fin file pointer and stores it in the
variable line.
The first statement (fscanf) reads the formatted data from the file using the file pointer, in the format
specified by the control string, and assigns the values to the variables var-1, var-2, ..., var-n.
The second statement (fprintf) writes the data in the variables var-1, var-2, var-3, ..., var-n into a file
using the file pointer, in the format specified by the control string.
Example
char ptype[] = "DNA", pt[10];
int length = 120,l;
fprintf ( fout, "%s %d", ptype, length);
fscanf( fin, "%s %d", pt, &l); /* fscanf needs the address of l */
Sample Program 9.2
Write a C program to convert the DNA sequences into RNA sequences using FILE
Sample Program 9.4
Write a C program to access the formatted data from the file