What is the main difference between the Poisson and binomial probability distributions?
Binomial Distribution | Poisson Distribution
It is biparametric, i.e. it has 2 parameters, n and p. | It is uniparametric, i.e. it has only 1 parameter, m.
The number of attempts is fixed. | The number of attempts is unlimited.
The probability of success is constant. | The probability of success is extremely small.
There are only two possible outcomes: success or failure. | There are unlimited possible outcomes.
Mean > Variance | Mean = Variance

A typist makes on average 2 mistakes per page. What is the probability of a particular page having no errors on it?
We have an average rate here: lambda = 2 errors per page. We don't have an exact probability (e.g. something like "there is a probability of 1/2 that a page contains errors"). Hence, Poisson distribution. (lambda t) = (2 errors per page * 1 page) = 2. Hence P0 = 2^0/0! * exp(-2) = 0.135.

Again, average rate given: lambda = 0.5 crashes/day. Hence, Poisson. (lambda t) = (0.5 per day * 7 days) = 3.5 per week, and we want n = 2 crashes. P2 = (3.5)^2/2! * exp(-3.5) = 0.185.

Here we are given a definite probability, in this case of defective components: p = 0.1 and hence q = 0.9 = Prob. not defective. Hence, Binomial, with n = 20. Expand (q + p)^20 to get q^20 + 20 q^19 p + 20(20-1)/2! q^18 p^2 + ..., where the successive terms give the probabilities of 0, 1, 2, ... faulty components. So P(2) = 20(20-1)/2! q^18 p^2 = 0.285.

We have a probability of something being true and the same thing not being true; in this case, an IC being faulty. Hence, Binomial distribution. p = Prob. faulty = 0.02, q = Prob. not faulty = 0.98, n = 10. Expand (q + p)^10 to get q^10 + 10 q^9 p + 10(10-1)/2! q^8 p^2 + ..., where the successive terms give the probabilities of 0, 1, 2, ... faulty ICs. So the probability of a box containing 2 faulty ICs is P2 = 10(10-1)/2! q^8 p^2 = 0.015.

Here we have an average rate of faults occurring: 8 per house. Hence, Poisson, with (lambda t) = (8 faults/house * 1 house) = 8. [1 house because we're only buying one new house.] n = 1 too, so P1 = 8^1/1! * exp(-8) = 0.0027.

Here too we have a probability of brass (1/3) and of not brass (i.e. steel), which is 2/3. Hence, use the Binomial distribution with p = 1/3, q = 2/3 and n = 4 to get (p + q)^4 = p^4 + 4 p^3 q + 6 p^2 q^2 + 4 p q^3 + q^4, where the successive terms correspond to 4, 3, 2, 1 and 0 brass items respectively. So P(0) = (2/3)^4 = 0.198, P(1) = 4 (1/3)(2/3)^3 = 0.395, P(2) = 6 (1/3)^2 (2/3)^2 = 0.296, P(3) = 4 (1/3)^3 (2/3) = 0.099 and P(4) = (1/3)^4 = 0.012.

The Binomial and Poisson distributions are similar, but they are different. The fact that they are both discrete does not mean that they are the same.
The Geometric distribution and one form of the Uniform distribution are also discrete, but they are very different from both the Binomial and Poisson distributions.
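The worked answers above can be checked numerically. The following is a minimal sketch in R, assuming only the built-in density functions dpois() and dbinom(); the variable names are illustrative and are not part of the original answers.

```r
# Poisson answers: dpois(k, lambda) gives P(X = k) for a mean count lambda.
p_no_errors   <- dpois(0, lambda = 2)        # typist: no errors on one page
p_two_crashes <- dpois(2, lambda = 0.5 * 7)  # 2 events at 3.5 per week
p_one_fault   <- dpois(1, lambda = 8)        # 1 fault at 8 per house

# Binomial answers: dbinom(k, size, prob) gives P(X = k) in `size` trials.
p_two_defective <- dbinom(2, size = 20, prob = 0.1)
p_two_faulty    <- dbinom(2, size = 10, prob = 0.02)
p_brass         <- dbinom(0:4, size = 4, prob = 1/3)  # P(0), ..., P(4)

print(round(c(p_no_errors, p_two_crashes, p_one_fault), 4))
print(round(c(p_two_defective, p_two_faulty), 3))
print(round(p_brass, 3))
```

Because the five brass probabilities cover every possible outcome, sum(p_brass) should equal 1, which is a quick sanity check on the hand calculation.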
The difference between the two is that while both measure the number of certain random events (or "successes") within a certain frame, the Binomial counts successes across a fixed number of discrete trials, while the Poisson counts events occurring over a continuous interval, such as a span of time or space. That is, with a binomial distribution you have a certain number, $n$, of "attempts," each of which has probability of success $p$. With a Poisson distribution, you essentially have infinite attempts, with infinitesimal chance of success. That is, given a Binomial distribution with some $n,p$, if you let $n\rightarrow\infty$ and $p\rightarrow0$ in such a way that $np\rightarrow\lambda$, then that distribution approaches a Poisson distribution with parameter $\lambda$.

Because of this limiting effect, Poisson distributions are used to model occurrences of events that could happen a very large number of times, but happen rarely. That is, they are used in situations that would be more properly represented by a Binomial distribution with a very large $n$ and small $p$, especially when the exact values of $n$ and $p$ are unknown. (A historical example is the number of wrongful criminal convictions in a country.)

R is a programming language and software environment for statistical analysis, graphics representation and reporting. R was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, and is currently developed by the R Development Core Team. The core of R is an interpreted computer language which allows branching and looping as well as modular programming using functions. R allows integration with procedures written in the C, C++, .Net, Python or FORTRAN languages for efficiency. R is freely available under the GNU General Public License, and pre-compiled binary versions are provided for various operating systems like Linux, Windows and Mac. R is free software distributed under a GNU-style copyleft, and an official part of the GNU project called GNU S.
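As a first taste of R applied to the limiting argument made earlier, the sketch below compares Binomial(n, lambda/n) with Poisson(lambda) as n grows. The value lambda = 2, the sample sizes and the variable names are arbitrary choices for illustration only.

```r
lambda <- 2                   # target mean; n * p is held equal to lambda
k      <- 0:5                 # counts at which the two distributions are compared
sizes  <- c(10, 100, 10000)   # increasing numbers of trials

# For each n, set p = lambda / n and record the largest pointwise difference
# between the Binomial and Poisson probabilities over the counts in k.
gaps <- sapply(sizes, function(n) {
  p <- lambda / n
  max(abs(dbinom(k, size = n, prob = p) - dpois(k, lambda)))
})

for (i in seq_along(sizes)) {
  cat("n =", sizes[i], " largest difference =", signif(gaps[i], 3), "\n")
}
```

The printed differences shrink as n increases, which is the behaviour the limit predicts.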
Evolution of R
R was initially written by Ross Ihaka and Robert Gentleman at the Department of Statistics of the University of Auckland in Auckland, New Zealand. R made its first appearance in 1993.
Features of R
As stated earlier, R is a programming language and software environment for statistical analysis, graphics representation and reporting. The following are the important features of R −
In conclusion, R is among the most widely used programming languages for statistics. It is a popular choice among data scientists and is supported by a vibrant community of contributors. R is taught in universities and deployed in mission-critical business applications. This tutorial will teach you R programming along with suitable examples in simple and easy steps.

Local Environment Setup
If you want to set up your own environment for R, you can follow the steps given below.

Windows Installation
You can download the Windows installer version of R from R-3.2.2 for Windows (32/64 bit) and save it in a local directory. It is a Windows installer (.exe) with a name of the form "R-version-win.exe", so you can just double-click it and run the installer, accepting the default settings. If your Windows is a 32-bit version, it installs the 32-bit version. If your Windows is 64-bit, it installs both the 32-bit and 64-bit versions. After installation you can locate the icon to run the program in the directory structure "R\R3.2.2\bin\i386\Rgui.exe" under the Windows Program Files. Clicking this icon brings up the R-GUI, which is the R console for R programming.

Linux Installation
R is available as a binary for many versions of Linux at the location R Binaries. The instructions for installing R on Linux vary from flavor to flavor, and the steps are given under each type of Linux version at that location. However, if you are in a hurry, you can use the yum command to install R as follows:

    $ yum install R

The above command installs the core functionality of R along with the standard packages. If you need an additional package, launch the R prompt as follows:

    $ R
    R version 3.2.0 (2015-04-16) -- "Full of Ingredients"
    Copyright (C) 2015 The R Foundation for Statistical Computing
    Platform: x86_64-redhat-linux-gnu (64-bit)

    R is free software and comes with ABSOLUTELY NO WARRANTY.
    You are welcome to redistribute it under certain conditions.
    Type 'license()' or 'licence()' for distribution details.

    R is a collaborative project with many contributors.
    Type 'contributors()' for more information and
    'citation()' on how to cite R or R packages in publications.

    Type 'demo()' for some demos, 'help()' for on-line help, or
    'help.start()' for an HTML browser interface to help.
    Type 'q()' to quit R.
    >

Now you can use the install command at the R prompt to install the required package. For example, the following command installs the plotrix package, which is required for 3D charts.

    > install.packages("plotrix")

As a convention, we will start learning R programming by writing a "Hello, World!" program. Depending on your needs, you can program either at the R command prompt or in an R script file. Let's check both one by one.

R Command Prompt
Once you have the R environment set up, you can start the R command prompt by typing the following command at your command prompt:

    $ R

This launches the R interpreter and gives you a prompt > where you can start typing your program as follows:

    > myString <- "Hello, World!"
    > print ( myString)
    [1] "Hello, World!"

Here the first statement defines a string variable myString and assigns it the string "Hello, World!". The next statement uses print() to print the value stored in myString.

R Script File
Usually, you will write your programs in script files and then execute those scripts at your command prompt with the help of the R interpreter called Rscript. So let's start by writing the following code in a text file called test.R:

    # My first program in R Programming
    myString <- "Hello, World!"
    print ( myString)

Save the above code in the file test.R and execute it at the Linux command prompt as given below. Even if you are using Windows or another system, the syntax remains the same.

    $ Rscript test.R

When we run the above program, it produces the following result.

    [1] "Hello, World!"
Comments
Comments are like helping text in your R program and they are ignored by the interpreter while executing your actual program. A single-line comment is written using # at the beginning of the statement as follows:

    # My first program in R Programming

R does not support multi-line comments, but you can use a trick such as the following:

    if(FALSE) {
       "This is a demo for multi-line comments and it should be put inside either a
          single OR double quote"
    }
    myString <- "Hello, World!"
    print ( myString)
    [1] "Hello, World!"

Although the interpreter still parses the text inside the if(FALSE) block, the block is never evaluated, so it does not interfere with your actual program. Such comments must be enclosed in either single or double quotes.

Generally, while programming in any language, you need to use variables to store information. Variables are named storage locations for values, so creating a variable reserves some space in memory. You may want to store information of various data types such as character, wide character, integer, floating point, double floating point, Boolean etc. Based on the data type of a variable, memory is allocated and the kind of value that can be stored in it is decided.

In contrast to other programming languages like C and Java, variables in R are not declared as some data type. Variables are assigned R-objects, and the data type of the R-object becomes the data type of the variable. There are many types of R-objects. The frequently used ones are:

Vectors
Lists
Matrices
Arrays
Factors
Data Frames
The simplest of these objects is the vector object, and there are six data types of these atomic vectors, also termed the six classes of vectors. The other R-objects are built upon the atomic vectors.

Data Type | Example
Logical | TRUE, FALSE
Numeric | 12.3, 5, 999
Integer | 2L, 34L, 0L
Complex | 3 + 2i
Character | 'a', "good", "TRUE", '23.4'
Raw | "Hello" is stored as 48 65 6c 6c 6f

In R programming, the very basic data types are the R-objects called vectors, which hold elements of different classes as shown above. Please note that in R the number of classes is not confined to only the above six types. For example, we can use many atomic vectors and create an array whose class will become array.

Vectors
When you want to create a vector with more than one element, you should use the c() function, which combines the elements into a vector.

Lists
A list is an R-object which can contain many different types of elements inside it, like vectors, functions and even another list.

Matrices
A matrix is a two-dimensional rectangular data set. It can be created using a vector input to the matrix() function.

Arrays
While matrices are confined to two dimensions, arrays can have any number of dimensions. The array() function takes a dim attribute which creates the required number of dimensions. For example, dim = c(3,3,2) creates an array with two elements which are 3x3 matrices each.
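The six atomic classes and the four object types described above can be sketched as follows. This is a minimal illustration, not the tutorial's original listing; every value and variable name in it is hypothetical.

```r
# One sample value per atomic class; class() reports the class of each.
samples <- list(TRUE, 23.5, 2L, 2+5i, "TRUE", charToRaw("Hello"))
atomic_classes <- sapply(samples, class)
print(atomic_classes)

# Vector: c() combines several elements into one atomic vector.
fruit <- c("apple", "banana", "cherry")
print(class(fruit))

# List: elements may be of different types, including functions.
mixed <- list(c(2, 5, 3), 21.3, sin)
print(length(mixed))

# Matrix: a vector laid out in two dimensions (filled row by row here).
m <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)
print(m)

# Array: dim = c(3, 3, 2) yields two 3x3 matrices.
a <- array(1:18, dim = c(3, 3, 2))
print(dim(a))
```

Printing charToRaw("Hello") on its own shows the raw bytes 48 65 6c 6c 6f listed in the table.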
Factors
Factors are R-objects which are created using a vector. A factor stores the vector along with the distinct values of the elements in the vector as labels. The labels are always character, irrespective of whether the input vector is numeric, character, Boolean etc. They are useful in statistical modeling. Factors are created using the factor() function. The nlevels() function gives the count of levels.

Data Frames
Data frames are tabular data objects. Unlike in a matrix, each column of a data frame can contain a different mode of data. The first column can be numeric while the second column can be character and the third column can be logical. A data frame is a list of vectors of equal length. Data frames are created using the data.frame() function.

A variable provides us with named storage that our programs can manipulate. A variable in R can store an atomic vector, a group of atomic vectors or a combination of many R-objects. A valid variable name consists of letters, numbers and the dot or underline characters. The variable name starts with a letter, or with a dot not followed by a number.

Variable Name | Validity | Reason
var_name2. | valid | Has letters, numbers, dot and underscore.
var_name% | invalid | Has the character '%'. Only dot(.) and underscore are allowed.
2var_name | invalid | Starts with a number.
.var_name, var.name | valid | Can start with a dot(.), but the dot(.) should not be followed by a number.
.2var_name | invalid | The starting dot is followed by a number, making it invalid.
_var_name | invalid | Starts with _, which is not valid.

Variable Assignment
Variables can be assigned values using the leftward, rightward and equal-to operators. The values of the variables can be printed using the print() or cat() function.
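The factor, data frame and assignment forms described above can be sketched as follows. The data and names (colours, people, var.1 and so on) are made up for illustration, and stringsAsFactors = FALSE is set explicitly so the sketch behaves the same on older and newer R versions.

```r
# Factor: stores the vector together with its distinct values (levels).
colours <- c("green", "red", "green", "yellow", "red")
colour_factor <- factor(colours)
print(levels(colour_factor))
print(nlevels(colour_factor))

# Data frame: columns of equal length that may differ in type.
people <- data.frame(
  name   = c("Ann", "Bob", "Cid"),
  height = c(152, 171.5, 165),
  member = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)
print(sapply(people, class))

# Assignment: leftward, equal-sign and rightward forms all bind a name.
var.1 <- c(0, 1, 2)
var.2 = c("learn", "R")
c(TRUE, 1) -> var.3   # TRUE is coerced to 1, giving a numeric vector

print(var.1)
cat("var.2 is", var.2, "\n")
cat("var.3 is", var.3, "\n")
```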
The cat() function combines multiple items into a continuous print output.

Note − The vector c(TRUE,1) has a mix of logical and numeric classes, so the logical value is coerced to numeric, making TRUE become 1.

Data Type of a Variable
In R, a variable itself is not declared with any data type; rather, it gets the data type of the R-object assigned to it. R is therefore called a dynamically typed language, which means that the data type of the same variable can change again and again as different objects are assigned to it in a program.

Finding Variables
To list all the variables currently available in the workspace, we use the ls() function. The output depends on what variables are declared in your environment. The ls() function can also take a pattern to match the variable names. Variables starting with a dot(.) are hidden; they can be listed by passing the "all.names = TRUE" argument to ls().

Deleting Variables
Variables can be deleted by using the rm() function. For example, after rm(var.3) deletes the variable var.3, printing its value throws an error. All the variables can be deleted by using the rm() and ls() functions together.

An operator is a symbol that tells the interpreter to perform specific mathematical or logical manipulations. The R language is rich in built-in operators and provides the following types of operators.

Types of Operators
We have the following types of operators in R programming:

Arithmetic Operators
Relational Operators
Logical Operators
Assignment Operators
Miscellaneous Operators
Arithmetic Operators
The following table shows the arithmetic operators supported by the R language. The operators act on each element of the vector.

Operator | Description
+ | Adds two vectors.
- | Subtracts the second vector from the first.
* | Multiplies both vectors.
/ | Divides the first vector by the second.
%% | Gives the remainder of dividing the first vector by the second.
%/% | Gives the integer quotient of dividing the first vector by the second.
^ | Raises the first vector to the exponent given by the second vector.

Relational Operators
The following table shows the relational operators supported by the R language. Each element of the first vector is compared with the corresponding element of the second vector. The result of the comparison is a Boolean value.
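The element-wise behaviour of the arithmetic and relational operators can be sketched as follows. The vectors v and w are hypothetical values chosen so that each result is easy to verify by hand.

```r
v <- c(2, 5.5, 6)
w <- c(8, 3, 4)

# Arithmetic operators act on corresponding elements of the two vectors.
total     <- v + w     # 2 + 8, 5.5 + 3, 6 + 4
remainder <- v %% w    # remainder of each element-wise division
quotient  <- v %/% w   # integer quotient of each element-wise division
power     <- v ^ w     # each element of v raised to the matching element of w

# Relational operators compare element by element and return logical vectors.
greater   <- v > w
different <- v != w

print(total)
print(remainder)
print(quotient)
print(power)
print(greater)
print(different)
```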
Operator | Description
> | Checks if each element of the first vector is greater than the corresponding element of the second vector.
< | Checks if each element of the first vector is less than the corresponding element of the second vector.
== | Checks if each element of the first vector is equal to the corresponding element of the second vector.
<= | Checks if each element of the first vector is less than or equal to the corresponding element of the second vector.
>= | Checks if each element of the first vector is greater than or equal to the corresponding element of the second vector.
!= | Checks if each element of the first vector is unequal to the corresponding element of the second vector.

Logical Operators
The following table shows the logical operators supported by the R language. They are applicable only to vectors of type logical, numeric or complex. All non-zero numbers are treated as the logical value TRUE, and zero is treated as FALSE. Each element of the first vector is compared with the corresponding element of the second vector. The result of the comparison is a Boolean value.

Operator | Description
& | Element-wise logical AND operator. It combines each element of the first vector with the corresponding element of the second vector and gives TRUE if both elements are TRUE.
| | Element-wise logical OR operator. It combines each element of the first vector with the corresponding element of the second vector and gives TRUE if at least one of the elements is TRUE.
! | Logical NOT operator. It takes each element of the vector and gives the opposite logical value.

The logical operators && and || consider only the first element of each vector and give a single-element vector as output. (In R 4.3.0 and later, passing a vector longer than one element to && or || is an error.)

Operator | Description
&& | Logical AND operator. Takes the first element of both vectors and gives TRUE only if both are TRUE.
|| | Logical OR operator. Takes the first element of both vectors and gives TRUE if at least one of them is TRUE.

Assignment Operators
These operators are used to assign values to vectors.

Operator | Description
<- or = or <<- | Left assignment
-> or ->> | Right assignment

Miscellaneous Operators
These operators are used for specific purposes rather than general mathematical or logical computation.

Operator | Description
: | Colon operator. It creates a series of numbers in sequence for a vector.
%in% | Identifies whether an element belongs to a vector.
%*% | Performs matrix multiplication, for example multiplying a matrix by its transpose.

Decision-making structures require the programmer to specify one or more conditions to be evaluated or tested by the program, along with a statement or statements to be executed if the condition is determined to be true, and optionally, other statements to be executed if the condition is determined to be false. R provides the following types of decision-making statements.

Sr.No. | Statement & Description
1 | if statement: An if statement consists of a Boolean expression followed by one or more statements.
2 | if...else statement: An if statement can be followed by an optional else statement, which executes when the Boolean expression is false.
3 | switch statement: A switch statement allows a variable to be tested for equality against a list of values.

There may be a situation when you need to execute a block of code several times. In general, statements are executed sequentially: the first statement in a function is executed first, followed by the second, and so on. Programming languages provide various control structures that allow for more complicated execution paths. A loop statement allows us to execute a statement or group of statements multiple times. The R programming language provides the following kinds of loop to handle looping requirements.
Sr.No. | Loop Type & Description
1 | repeat loop: Executes a sequence of statements repeatedly until a break statement stops it.
2 | while loop: Repeats a statement or group of statements while a given condition is true. It tests the condition before executing the loop body.
3 | for loop: Executes the loop body once for each element of a vector or list, assigning the element to the loop variable.

Loop Control Statements
Loop control statements change execution from its normal sequence. Note that loops in R do not create a new scope: variables assigned inside a loop body remain available after the loop ends. R supports the following control statements.

Sr.No. | Control Statement & Description
1 | break statement: Terminates the loop statement and transfers execution to the statement immediately following the loop.
2 | next statement: Skips the remainder of the current iteration and proceeds to the next iteration of the loop.

A function is a set of statements organized together to perform a specific task. R has a large number of in-built functions, and users can create their own functions. In R, a function is an object, so the R interpreter is able to pass control to the function, along with arguments that may be necessary for the function to accomplish its actions. The function in turn performs its task and returns control to the interpreter, as well as any result, which may be stored in other objects.

Function Definition
An R function is created by using the keyword function. In its basic form, a definition assigns the function to a name: function_name <- function(arg_1, arg_2, ...) { function body }.

Function Components
The different parts of a function are −
R has many built-in functions which can be called directly in a program without defining them first. We can also create and use our own functions, referred to as user-defined functions.

Built-in Function

Simple examples of built-in functions are seq(), mean(), max(), sum(x) and paste(...). They are called directly by user-written programs.

User-defined Function

We can create user-defined functions in R. They are specific to what a user wants and, once created, can be used like the built-in functions.

Calling a Function

A function is called by writing its name followed by its arguments in parentheses.
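As a small illustration of the two kinds of function described above, the sketch below calls a few built-in functions and then defines and calls a user-defined one. The name squares_up_to and the input values are assumptions chosen for this example.

```r
# Built-in functions are available without any definition.
s <- seq(5, 9)          # the sequence 5, 6, 7, 8, 9
m <- mean(25:82)        # mean of the integers from 25 to 82
total <- sum(41:68)     # sum of the integers from 41 to 68

# Hypothetical user-defined function: returns the squares of 1..a.
squares_up_to <- function(a) {
  (1:a)^2
}

# Once defined, it is called exactly like a built-in function.
sq <- squares_up_to(4)

print(s)
print(m)
print(total)
print(sq)
```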
Calling a Function without an Argument

A function defined with no arguments is called with an empty pair of parentheses.

Calling a Function with Argument Values (by position and by name)

The arguments to a function call can be supplied in the same sequence as defined in the function, or they can be supplied in a different sequence if each value is assigned to the name of its argument.
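The calling styles described above can be sketched as follows. The functions print_squares and combine are hypothetical names used only for this illustration.

```r
# Hypothetical function that takes no arguments.
print_squares <- function() {
  for (i in 1:3) print(i^2)
  invisible(NULL)
}
print_squares()                      # called with empty parentheses

# Hypothetical function with three arguments.
combine <- function(x, y, z) {
  x * y + z
}

by_position <- combine(5, 3, 11)             # x = 5, y = 3, z = 11
by_name     <- combine(z = 11, x = 5, y = 3) # same values, different order

print(by_position)
print(by_name)
```

Both calls bind the same values to the same arguments, so they return the same result.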
Calling a Function with Default Argument

We can define default values for the arguments in the function definition and call the function without supplying any arguments to get the default result. We can also call such functions with new argument values to get a non-default result.
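A minimal sketch of default arguments, assuming a hypothetical function scale_value whose defaults are chosen only for illustration:

```r
# Hypothetical function: both arguments have default values.
scale_value <- function(a = 3, b = 6) {
  a * b
}

default_result <- scale_value()        # uses a = 3 and b = 6
custom_result  <- scale_value(9, 5)    # overrides both defaults
partial_result <- scale_value(b = 2)   # overrides b only, a stays 3

print(default_result)
print(custom_result)
print(partial_result)
```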
Lazy Evaluation of Function

Arguments to functions are evaluated lazily, which means they are evaluated only when needed by the function body.
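Lazy evaluation can be observed by leaving out an argument that the function body never touches. The sketch below assumes a hypothetical function lazy_demo; tryCatch() is used only so that the failing call does not stop the script.

```r
# Hypothetical function: b is only evaluated when use_b is TRUE.
lazy_demo <- function(a, b, use_b = FALSE) {
  if (use_b) a + b else a^2
}

# b is missing, but it is never evaluated, so the call succeeds.
ok <- lazy_demo(6)

# Here the body needs b, so evaluating the missing argument fails.
err <- tryCatch(lazy_demo(6, use_b = TRUE),
                error = function(e) "error")

print(ok)
print(err)
```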
Any value written within a pair of single quotes or double quotes in R is treated as a string. Internally R stores every string within double quotes, even when you create it with single quotes.

Rules Applied in String Construction

The quotes at the beginning and end of a string must match: both double quotes or both single quotes.
Double quotes can be inserted into a string that starts and ends with single quotes.
A single quote can be inserted into a string that starts and ends with double quotes.
Double quotes cannot be inserted unescaped into a string that starts and ends with double quotes.
A single quote cannot be inserted unescaped into a string that starts and ends with single quotes.
Examples of Valid Strings

The following strings are all valid in R: 'Start and end with single quote', "Start and end with double quotes", "single quote ' in between double quotes" and 'Double quotes " in between single quote'.

Examples of Invalid Strings

A string is invalid when its opening and closing quotes do not match, as in 'Mixed quotes", or when it contains an unescaped quote of the same kind as its delimiters, as in 'Single quote ' inside single quote'.
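The difference between valid and invalid strings can be checked directly. This is a sketch with made-up variable names; the invalid string is passed to parse() as text so that the failure can be caught instead of aborting the script.

```r
# Valid strings: the delimiters match, and any inner quote is of the
# other kind or is escaped with a backslash.
a <- 'Start and end with single quote'
b <- "Start and end with double quotes"
d <- "single quote ' in between double quotes"
e <- 'Double quotes " in between single quote'
f <- "escaped \" double quote inside double quotes"

print(a)   # R prints every string with double quotes
print(e)   # the inner double quote is displayed escaped

# Invalid source text: a single quote inside a single-quoted string.
bad_source <- "g <- 'Single quote ' inside single quote'"
parsed_ok <- tryCatch({
  parse(text = bad_source)
  TRUE
}, error = function(err) FALSE)

print(parsed_ok)
```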
Running a script that contains an invalid string fails with a parse error.

String Manipulation

Concatenating Strings - paste() function

Many strings in R are combined using the paste() function. It can take any number of arguments to be combined together.

Syntax

The basic syntax for the paste function is: paste(..., sep = " ", collapse = NULL)

Following is the description of the parameters used −

...: any number of arguments to be combined.
sep: the separator placed between the arguments. It is optional and defaults to a single space.
collapse: an optional string used to join the elements of the result into a single string.
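The effect of the sep and collapse parameters can be sketched as follows; the input strings are arbitrary examples.

```r
a <- "Hello"
b <- "How"
d <- "are you?"

spaced <- paste(a, b, d)                           # default sep = " "
dashed <- paste(a, b, d, sep = "-")                # custom separator
joined <- paste(c("x", "y", "z"), collapse = "+")  # join vector elements

print(spaced)
print(dashed)
print(joined)
```

Note that sep is placed between the arguments, while collapse joins the elements of the resulting vector into one string.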
Formatting numbers & strings - format() function

Numbers and strings can be formatted to a specific style using the format() function.

Syntax

The basic syntax for the format function is: format(x, digits, nsmall, scientific, width, justify = c("left", "right", "centre", "none"))

Following is the description of the parameters used −

x: the input vector.
digits: the number of significant digits to display.
nsmall: the minimum number of digits to the right of the decimal point.
scientific: set to TRUE to display scientific notation.
width: the minimum width of the result, padded with blanks.
justify: the alignment of a string to the left, right or centre.
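A few of these parameters can be tried out as in the sketch below. The input values are arbitrary, and only a subset of the parameters is shown.

```r
# Limit the number of significant digits.
nine_digits <- format(23.123456789, digits = 9)

# Guarantee at least five digits after the decimal point.
five_decimals <- format(23.47, nsmall = 5)

# Numbers are padded on the left to reach the requested width.
padded_number <- format(13.7, width = 6)

# Strings can be justified inside the requested width.
left_text <- format("Hello", width = 8, justify = "left")

# Scientific notation for a numeric vector.
sci <- format(c(6, 13.14521), scientific = TRUE)

print(nine_digits)
print(five_decimals)
print(padded_number)
print(left_text)
print(sci)
```

format() always returns character strings, so the results are meant for display rather than for further arithmetic.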
Counting number of characters in a string - nchar() function

This function counts the number of characters in a string, including spaces.

Syntax

The basic syntax for the nchar() function is: nchar(x)
Following is the description of the parameters used −

x: the input vector.

Changing the case - toupper() & tolower() functions

These functions change the case of the characters of a string.

Syntax

The basic syntax for the toupper() & tolower() functions is: toupper(x) and tolower(x)

Following is the description of the parameters used −

x: the input vector.
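The nchar(), toupper() and tolower() functions described above can be sketched together; the sample strings are arbitrary.

```r
text <- "Count the number of characters"

n <- nchar(text)                       # spaces are counted as characters
upper <- toupper("Changing To Upper")
lower <- tolower("Changing To Lower")

print(n)
print(upper)
print(lower)
```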
Extracting parts of a string - substring() function

This function extracts parts of a string.

Syntax

The basic syntax for the substring() function is: substring(x, first, last)

Following is the description of the parameters used −

x: the input character vector.
first: the position of the first character to be extracted.
last: the position of the last character to be extracted.
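A short sketch of substring(), using an arbitrary sample word. Positions are counted from 1, and both first and last are inclusive.

```r
# Extract the characters at positions 5 to 7.
part <- substring("Extract", 5, 7)

# first and last can be vectors, giving one substring per pair.
parts <- substring("Extract", 1:3, 3:5)

print(part)
print(parts)
```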
Vectors are the most basic R data objects and there are six types of atomic vectors. They are logical, integer, double, complex, character and raw.

Vector Creation

Single Element Vector

Even when you write just one value in R, it becomes a vector of length 1 and belongs to one of the above vector types.
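The six atomic types can be inspected with class() and typeof(), as in this sketch with arbitrary sample values:

```r
# Each single value is already a vector of length 1.
classes <- c(
  class("abc"),                 # character
  class(12.5),                  # numeric (stored as double)
  class(63L),                   # integer
  class(TRUE),                  # logical
  class(2 + 3i),                # complex
  class(charToRaw("hello"))     # raw
)

print(classes)
print(typeof(12.5))   # the storage type behind class "numeric"
print(length(12.5))   # a single value has length 1
```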
Multiple Elements Vector

Using colon operator with numeric data

The colon operator creates a sequence of values in steps of 1, starting at the number on its left and stopping at or before the number on its right.
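A brief sketch of the colon operator with whole and fractional end points; the numbers are arbitrary.

```r
v1 <- 5:13        # 5, 6, ..., 13
v2 <- 6.6:12.6    # 6.6, 7.6, ..., 12.6
v3 <- 3.8:11.4    # starts at 3.8 and stops before exceeding 11.4

print(v1)
print(v2)
print(v3)
```

When the end point cannot be reached in steps of 1, as in the last case, the sequence stops at the last value that does not exceed it.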
Using the seq() function

The seq() function creates a sequence with a chosen increment, given with the by argument.
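As a sketch, seq() can be given either a step size or a desired number of elements; the values below are arbitrary.

```r
# Step from 5 to 9 in increments of 0.4.
by_step <- seq(5, 9, by = 0.4)

# Ask for five evenly spaced values between 0 and 1.
by_length <- seq(0, 1, length.out = 5)

print(by_step)
print(by_length)
```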
Using the c() function

The non-character values are coerced to character type if one of the elements is a character.

Accessing Vector Elements

Elements of a vector are accessed using indexing. The [ ] brackets are used for indexing. Indexing starts with position 1. Giving a negative value in the index drops that element from the result. A logical vector of TRUE and FALSE values can also be used for indexing; it selects the elements at the TRUE positions. In a numeric index, 0 selects nothing and 1 selects the first element, so a vector of 0s and 1s does not behave like a logical mask.
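Coercion by c() and the indexing styles described above can be sketched as follows; the vectors are made-up sample data.

```r
# Mixing types: everything becomes character.
mixed <- c("apple", "red", 5, TRUE)
print(class(mixed))

days <- c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat")

by_position <- days[c(2, 3, 6)]     # positions start at 1
by_logical  <- days[c(TRUE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE)]
dropped     <- days[c(-2, -5)]      # negative indexes drop elements
zero_one    <- days[c(0, 0, 1)]     # zeros select nothing, 1 selects "Sun"

print(by_position)
print(by_logical)
print(dropped)
print(zero_one)
```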
Vector Manipulation

Vector arithmetic

Two vectors of the same length can be added, subtracted, multiplied or divided, giving the result as a vector output. The operations are applied element by element.
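Element-wise arithmetic on two vectors of equal length can be sketched like this; the values are arbitrary.

```r
v1 <- c(3, 8, 4, 5, 0, 11)
v2 <- c(4, 11, 0, 8, 1, 2)

add_result <- v1 + v2
sub_result <- v1 - v2
mul_result <- v1 * v2
div_result <- v1 / v2    # dividing by zero gives Inf rather than an error

print(add_result)
print(sub_result)
print(mul_result)
print(div_result)
```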
Vector Element Recycling

If we apply arithmetic operations to two vectors of unequal length, then the elements of the shorter vector are recycled to complete the operations.
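Recycling can be seen by combining a vector of length 6 with one of length 2, as in this sketch with arbitrary values:

```r
v1 <- c(3, 8, 4, 5, 0, 11)
v2 <- c(4, 11)            # recycled as 4, 11, 4, 11, 4, 11

add_result <- v1 + v2
sub_result <- v1 - v2

print(add_result)
print(sub_result)
```

If the longer length is not a multiple of the shorter one, R still recycles but issues a warning.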

Vector Element Sorting
Elements in a vector can be sorted using the sort() function.

Lists are R objects that can contain elements of different types, such as numbers, strings, vectors and even another list.
A list can also contain a matrix or a function as one of its elements. A list is created using the list() function.

Creating a List
A list can be created that contains strings, numbers, vectors and logical values together.

Naming List Elements
The list elements can be given names, and they can then be accessed using these names.
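Creating a list and naming its elements might look like the sketch below. All variable names, element names and values are hypothetical, chosen only to show mixed element types.

```r
# A list mixing strings, a numeric vector, a logical value and numbers.
list_data <- list("Red", "Green", c(21, 32, 11), TRUE, 51.23, 119.1)
print(list_data)

# A list holding a vector, a matrix and another list.
named_list <- list(c("Jan", "Feb", "Mar"),
                   matrix(c(3, 9, 5, 1, -2, 8), nrow = 2),
                   list("green", 12.3))

# Give each element a name, then access one element by that name.
names(named_list) <- c("1st Quarter", "A_Matrix", "An Inner list")
print(named_list$A_Matrix)
```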

Accessing List Elements
Elements of a list can be accessed by their index in the list. In a named list, they can also be accessed by name. We continue to use the list from the example above.
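The sketch below rebuilds a small named list so that it runs on its own; the names and contents are assumptions made for illustration. It shows access by index and by name.

```r
list_data <- list(c("Jan", "Feb", "Mar"),
                  matrix(c(3, 9, 5, 1, -2, 8), nrow = 2),
                  list("green", 12.3))
names(list_data) <- c("quarter", "a_matrix", "inner")

first_element <- list_data[[1]]    # [[ ]] returns the element itself
first_sublist <- list_data[1]      # [ ] returns a list of length one
by_name       <- list_data$a_matrix

print(first_element)
print(first_sublist)
print(by_name)
```

The difference between `[[ ]]` and `[ ]` matters in practice: only the former gives back the stored vector or matrix directly.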

Manipulating List Elements
We can add, delete and update list elements. New elements are typically added at the end of a list; any element can be updated, and an element is deleted by assigning NULL to it.
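A minimal sketch of the three operations on an invented list: adding at the end, deleting the last element, and updating an existing one.

```r
list_data <- list(c("Jan", "Feb", "Mar"), c(3, 9, 5), "green")

# Add an element at the end of the list.
list_data[[4]] <- "New element"
length_after_add <- length(list_data)

# Delete the last element by assigning NULL to it.
list_data[[4]] <- NULL
length_after_delete <- length(list_data)

# Update the third element.
list_data[[3]] <- "updated element"
print(list_data[[3]])
```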

Merging Lists
You can merge many lists into one list by passing all of them to the c() function. Placing them inside one list() call would instead produce a list of lists.

Converting List to Vector
A list can be converted to a vector so that its elements can be used for further manipulation. All the arithmetic operations on vectors can be applied after the list has been converted. To do this conversion, we use the unlist() function.
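The sketch below illustrates both merging and conversion with invented lists. It assumes c() as the merging call, because c() concatenates lists into one flat list, whereas list(list1, list2) yields a two-element list of lists.

```r
list1 <- list(1, 2, 3)
list2 <- list("Sun", "Mon", "Tue")

merged.list <- c(list1, list2)     # one flat list with 6 elements
nested.list <- list(list1, list2)  # 2 elements, each itself a list

# Convert a list to a vector, then use vector arithmetic on it.
num_list <- list(1:5)
v <- unlist(num_list)
shifted <- v + 10

print(length(merged.list))
print(length(nested.list))
print(shifted)
```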
The unlist() function takes the list as input and produces a vector.

Matrices are R objects in which the elements are arranged in a two-dimensional rectangular layout. All elements of a matrix have the same atomic type. Although a matrix can hold only characters or only logical values, matrices of numeric elements are the most common because they are used in mathematical calculations. A matrix is created using the matrix() function.

Syntax
The basic syntax for creating a matrix in R is matrix(data, nrow, ncol, byrow, dimnames).

Following is the description of the parameters used: data is the input vector whose values become the elements of the matrix; nrow is the number of rows; ncol is the number of columns; byrow is a logical value, and if TRUE the elements are filled in row by row instead of column by column; dimnames is a list holding the names assigned to the rows and columns.

Example
Create a matrix taking a vector of numbers as input.

Accessing Elements of a Matrix
Elements of a matrix can be accessed by using the row and column index of the element, in that order. We consider the matrix P above to find specific elements.
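Since the matrix P itself is not shown, the sketch below builds a stand-in with that name from the numbers 3 to 14 and then reads elements from it. The values and the row and column names are assumptions made for illustration.

```r
row_names <- c("row1", "row2", "row3", "row4")
col_names <- c("col1", "col2", "col3")

# Fill a 4 x 3 matrix row by row from the vector 3:14.
P <- matrix(3:14, nrow = 4, byrow = TRUE,
            dimnames = list(row_names, col_names))
print(P)

element_1_3 <- P[1, 3]   # row 1, column 3
element_4_2 <- P[4, 2]   # row 4, column 2
second_row  <- P[2, ]    # every column of row 2
third_col   <- P[, 3]    # every row of column 3
```

Indexing is always written as P[row, column]; leaving one position empty selects the whole row or column.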

Matrix Computations
Various mathematical operations can be performed on matrices using the R operators. The result of the operation is also a matrix. For the element-wise operators used here, the dimensions (number of rows and columns) must be the same for the matrices involved.

Matrix Addition & Subtraction
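Addition and subtraction can be sketched on two 2 x 3 matrices. The matrix names and values are invented for illustration.

```r
matrix1 <- matrix(c(3, 9, -1, 4, 2, 6), nrow = 2)
matrix2 <- matrix(c(5, 2, 0, 9, 3, 4), nrow = 2)

sum_result  <- matrix1 + matrix2   # element-wise addition
diff_result <- matrix1 - matrix2   # element-wise subtraction

print(sum_result)
print(diff_result)
```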

Matrix Multiplication & Division
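A sketch with invented values. Note that in R the `*` and `/` operators work element by element; the algebraic matrix product uses the `%*%` operator instead, shown at the end for contrast.

```r
matrix1 <- matrix(c(3, 9, -1, 4, 2, 6), nrow = 2)
matrix2 <- matrix(c(5, 2, 0, 9, 3, 4), nrow = 2)

prod_result <- matrix1 * matrix2   # element-wise multiplication
div_result  <- matrix1 / matrix2   # element-wise division; x / 0 gives Inf or -Inf

# Algebraic matrix product: (2 x 3) times (3 x 2) gives a 2 x 2 matrix.
true_product <- matrix1 %*% t(matrix2)

print(prod_result)
print(div_result)
print(true_product)
```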

Arrays are R data objects that can store data in more than two dimensions. For example, if we create an array of dimension (2, 3, 4), it creates 4 rectangular matrices, each with 2 rows and 3 columns. Arrays can store only one data type. An array is created using the array() function. It takes vectors as input and uses the values in the dim parameter to create an array.

Example
The following example creates an array of two 3x3 matrices.
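A minimal sketch of such an array, built from two invented input vectors. The nine input values are recycled to fill all 18 cells, so both 3x3 matrices end up with the same contents.

```r
vector1 <- c(5, 9, 3)
vector2 <- c(10, 11, 12, 13, 14, 15)

# dim = c(rows, columns, number of matrices)
result <- array(c(vector1, vector2), dim = c(3, 3, 2))
print(result)
```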

Naming Columns and Rows
We can give names to the rows, columns and matrices in the array by using the dimnames parameter.

Accessing Array Elements
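Naming the dimensions and then reading elements back can be sketched as below. The dimension names and values are hypothetical. Elements are addressed as [row, column, matrix], either by position or by name.

```r
result <- array(c(5, 9, 3, 10, 11, 12, 13, 14, 15), dim = c(3, 3, 2),
                dimnames = list(c("ROW1", "ROW2", "ROW3"),
                                c("COL1", "COL2", "COL3"),
                                c("Matrix1", "Matrix2")))

third_row_m2  <- result[3, , 2]    # third row of the second matrix
single_value  <- result[1, 3, 1]   # row 1, column 3 of the first matrix
second_matrix <- result[, , 2]     # the whole second matrix
by_name       <- result["ROW2", "COL2", "Matrix1"]

print(third_row_m2)
print(single_value)
print(second_matrix)
```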

Manipulating Array Elements
As an array is made up of matrices in multiple dimensions, operations on the elements of an array are carried out by accessing the elements of its matrices.
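One way to picture this is to pull a matrix out of each of two arrays and combine them. The arrays below are invented for illustration.

```r
array1 <- array(c(5, 9, 3, 10, 11, 12, 13, 14, 15), dim = c(3, 3, 2))
array2 <- array(c(9, 1, 0, 6, 0, 11, 3, 14, 1), dim = c(3, 3, 2))

# Extract the second matrix from each array.
matrix1 <- array1[, , 2]
matrix2 <- array2[, , 2]

# Ordinary matrix arithmetic now applies.
result <- matrix1 + matrix2
print(result)
```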

Calculations Across Array Elements
We can do calculations across the elements in an array using the apply() function.

Syntax
The basic syntax is apply(X, MARGIN, FUN).

Following is the description of the parameters used: X is an array; MARGIN is the dimension (or vector of dimensions) over which the function is applied, for example 1 for rows and 2 for columns; FUN is the function to be applied to the elements along that margin.
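A short sketch of apply() on an invented 3 x 3 x 2 array. With margin 1 the function is applied to each row, collecting that row's values from every matrix; with margin 3 it is applied to each matrix as a whole.

```r
new.array <- array(c(5, 9, 3, 10, 11, 12, 13, 14, 15), dim = c(3, 3, 2))

row_sums    <- apply(new.array, 1, sum)   # one sum per row, across both matrices
matrix_sums <- apply(new.array, 3, sum)   # one sum per matrix

print(row_sums)
print(matrix_sums)
```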

Example
apply() can be used to calculate the sum of the elements in each row of an array across all the matrices.

Factors are data objects used to categorize data and store it as levels. They can store both strings and integers. They are useful in columns that have a limited number of unique values, such as "Male" and "Female" or TRUE and FALSE. They are useful in data analysis for statistical modeling. Factors are created using the factor() function, which takes a vector as input.
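Creating a factor from a character vector might look like this sketch; the region names are invented. By default the levels are the sorted unique values of the input.

```r
data <- c("East", "West", "East", "North", "North", "East",
          "West", "West", "West", "East", "North")

print(is.factor(data))        # a plain character vector is not a factor

factor_data <- factor(data)   # convert to a factor
print(factor_data)
print(levels(factor_data))
print(table(factor_data))     # count of observations per level
```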

Factors in Data Frame
In R versions before 4.0.0, creating a data frame with a column of text data makes R treat the text column as categorical data and create a factor from it by default. From R 4.0.0 onward this happens only when stringsAsFactors = TRUE is passed to data.frame().
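The sketch below passes stringsAsFactors = TRUE explicitly so that the text column becomes a factor regardless of the R version in use. The column names and values are made up for illustration.

```r
height <- c(132, 151, 162, 139, 166, 147, 122)
weight <- c(48, 49, 66, 53, 67, 52, 40)
gender <- c("male", "male", "female", "female", "male", "female", "male")

# Request factor conversion explicitly; the default differs between R versions.
input_data <- data.frame(height, weight, gender, stringsAsFactors = TRUE)

print(input_data)
print(is.factor(input_data$gender))
print(levels(input_data$gender))
```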

Changing the Order of Levels
The order of the levels in a factor can be changed by applying the factor() function again with the new order of the levels.
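Reordering can be sketched as follows, with invented region names. The data values stay the same; only the order of the levels changes.

```r
data <- c("East", "West", "East", "North", "North", "East",
          "West", "West", "West", "East", "North")

factor_data <- factor(data)
print(levels(factor_data))      # default, sorted order

# Apply factor() again, stating the desired order of the levels.
new_order_data <- factor(factor_data, levels = c("East", "West", "North"))
print(levels(new_order_data))
```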

Generating Factor Levels
We can generate factor levels by using the gl() function. It takes two integers as input: the number of levels and the number of times each level is repeated.

Syntax
The basic syntax is gl(n, k, labels).

Following is the description of the parameters used: n is an integer giving the number of levels; k is an integer giving the number of replications of each level; labels is an optional vector of labels for the resulting factor levels.
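A minimal sketch of gl() with three levels repeated four times each. The city labels are hypothetical.

```r
# 3 levels, each repeated 4 times, with custom labels.
v <- gl(3, 4, labels = c("Tampa", "Seattle", "Boston"))
print(v)
print(levels(v))
```

The result is a factor of length n * k, with each level appearing in a consecutive run.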
A data frame is a table or a two-dimensional array-like structure in which each column contains values of one variable and each row contains one set of values from each column. Following are the characteristics of a data frame: the column names should be non-empty, the row names should be unique, the data stored can be of numeric, factor or character type, and each column should contain the same number of data items.
Create Data Frame

A data frame is created with the data.frame() function, with one argument per column.

Get the Structure of the Data Frame

The structure of the data frame can be seen by using the str() function.
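As a sketch, the snippet below builds a small data frame and inspects it with str(). The name emp.data, the column names and all values are assumptions made up for illustration:

```r
# Hypothetical employee records; names and values are illustrative only.
emp.data <- data.frame(
  emp_id = 1:5,
  emp_name = c("Rick", "Dan", "Michelle", "Ryan", "Gary"),
  salary = c(623.30, 515.20, 611.00, 729.00, 843.25),
  start_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15",
                         "2014-05-11", "2015-03-27")),
  stringsAsFactors = FALSE
)
print(emp.data)

# str() reports the number of rows and columns and the type of each column.
str(emp.data)
```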
Summary of Data in Data Frame

The statistical summary and nature of the data can be obtained by applying the summary() function.
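A minimal sketch of summary() on a made-up data frame (emp.data and its values are assumptions for illustration):

```r
# Hypothetical data frame with two numeric columns.
emp.data <- data.frame(
  emp_id = 1:5,
  salary = c(623.30, 515.20, 611.00, 729.00, 843.25)
)

# Summary of every column at once.
print(summary(emp.data))

# Summary of a single numeric column: minimum, quartiles, mean, maximum.
salary_summary <- summary(emp.data$salary)
print(salary_summary)
```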
Extract Data from Data Frame

Extract a specific column from a data frame using the column name.
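A sketch of column extraction by name, again on a made-up emp.data (all names and values are assumptions):

```r
# Hypothetical employee records.
emp.data <- data.frame(
  emp_id = 1:5,
  emp_name = c("Rick", "Dan", "Michelle", "Ryan", "Gary"),
  salary = c(623.30, 515.20, 611.00, 729.00, 843.25),
  stringsAsFactors = FALSE
)

# A single column as a plain vector.
salaries <- emp.data$salary
print(salaries)

# Several named columns, kept together as a data frame.
result <- emp.data[, c("emp_name", "salary")]
print(result)
```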
Extract the first two rows and then all columns.

Extract the 3rd and 5th rows with the 2nd and 4th columns.
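Both extractions can be sketched with the [rows, columns] index form. The data frame below is hypothetical:

```r
# Hypothetical employee records with four columns.
emp.data <- data.frame(
  emp_id = 1:5,
  emp_name = c("Rick", "Dan", "Michelle", "Ryan", "Gary"),
  salary = c(623.30, 515.20, 611.00, 729.00, 843.25),
  dept = c("IT", "Operations", "IT", "HR", "Finance"),
  stringsAsFactors = FALSE
)

# First two rows, all columns (the column index is left empty).
first_two <- emp.data[1:2, ]
print(first_two)

# 3rd and 5th rows with the 2nd and 4th columns.
picked <- emp.data[c(3, 5), c(2, 4)]
print(picked)
```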
Expand Data Frame

A data frame can be expanded by adding columns and rows.

Add Column

Just add the column vector using a new column name.
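A sketch of both ways of expanding a data frame mentioned above: a new column assigned by name, and new rows appended with rbind(). All names and values are made up for illustration:

```r
# Hypothetical starting data.
emp.data <- data.frame(
  emp_id = 1:3,
  emp_name = c("Rick", "Dan", "Michelle"),
  stringsAsFactors = FALSE
)

# Add a column: assign a vector under a new column name.
emp.data$dept <- c("IT", "Operations", "IT")

# Add rows: the new rows must have the same columns as the existing frame.
new.rows <- data.frame(
  emp_id = 4:5,
  emp_name = c("Ryan", "Gary"),
  dept = c("HR", "Finance"),
  stringsAsFactors = FALSE
)
emp.final <- rbind(emp.data, new.rows)
print(emp.final)
```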
Add Row

To add more rows permanently to an existing data frame, we need to bring in the new rows in the same structure as the existing data frame and use the rbind() function. That is, we create a data frame with the new rows and combine it with the existing data frame to create the final data frame.

R packages are a collection of R functions, compiled code and sample data. They are stored under a directory called "library" in the R environment. By default, R installs a set of packages during installation. More packages are added later, when they are needed for some specific purpose. When we start the R console, only the default packages are available. Other packages which are already installed have to be loaded explicitly to be used by the R program that is going to use them.

All the packages available in R language are listed at R Packages.
Below is a list of commands to be used to check, verify and use the R packages.

Check Available R Packages

Get the library locations containing R packages with .libPaths(). Get the list of all the packages installed by calling library() with no arguments. Get all packages currently loaded in the R environment with search(). The results of these commands may vary depending on the local settings of your PC.

Install a New Package

There are two ways to add new R packages. One is installing directly from the CRAN directory and another is downloading the package to your local system and installing it manually.

Install directly from CRAN

The install.packages() command gets the package directly from the CRAN webpage and installs it in the R environment. You may be prompted to choose the nearest mirror. Choose the one appropriate to your location.

Install package manually

Go to the link R Packages to download the package needed. Save the package as a .zip file in a suitable location in the local system. Now you can run install.packages() on the saved file, with repos = NULL, to install this package in the R environment.

Load Package to Library

Before a package can be used in the code, it must be loaded into the current R environment with the library() function. You also need to load a package that was installed previously but is not available in the current environment.

Data Reshaping in R is about changing the way data is organized into rows and columns.
Most of the time data processing in R is done by taking the input data as a data frame. It is easy to extract data from the rows and columns of a data frame, but there are situations when we need the data frame in a format that is different from the format in which we received it. R has many functions to split, merge and change the rows to columns and vice-versa in a data frame.

Joining Columns and Rows in a Data Frame

We can join multiple vectors to create a data frame using the cbind() function. We can also merge two data frames using the rbind() function.

Merging Data Frames

We can merge two data frames by using the merge() function. The data frames must have the same column names on which the merging happens.

For example, consider the data sets about diabetes in Pima Indian women available in the library named "MASS". We merge the two data sets based on the values of blood pressure ("bp") and body mass index ("bmi"). On choosing these two columns for merging, the records where the values of these two variables match in both data sets are combined together to form a single data frame.

Melting and Casting

One of the most interesting aspects of R programming is changing the shape of the data in multiple steps to get a desired shape. The functions used to do this are called melt() and cast().

We consider the dataset called ships present in the library called "MASS".

Melt the Data

Now we melt the data to organize it, converting all columns other than type and year into multiple rows.
Cast the Molten Data

We can cast the molten data into a new form where the aggregate of each type of ship for each year is created. It is done using the cast() function.

In R, we can read data from files stored outside the R environment. We can also write data into files which will be stored and accessed by the operating system. R can read and write various file formats like csv, excel, xml etc.

In this chapter we will learn to read data from a csv file and then write data into a csv file. The file should be present in the current working directory so that R can read it. Of course, we can also set our own directory and read files from there.

Getting and Setting the Working Directory

You can check which directory the R workspace is pointing to using the getwd() function. You can also set a new working directory using the setwd() function. The result of getwd() depends on your OS and the directory where you are working.

Input as CSV File

The csv file is a text file in which the values in the columns are separated by a comma. Let's consider data present in a file named input.csv. You can create this file using Windows Notepad by copying and pasting the data. Save the file as input.csv using the Save As, All files (*.*) option in Notepad.
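As a sketch of such an input file, the snippet below writes a small, made-up data set to input.csv. The column names (id, name, salary, start_date, dept) and all values are assumptions chosen for illustration, and the file is written to a temporary directory rather than the working directory so that nothing existing is overwritten:

```r
# Show the current working directory.
print(getwd())

# Hypothetical sample data; column names and values are illustrative only.
csv_lines <- c(
  "id,name,salary,start_date,dept",
  "1,Rick,623.3,2012-01-01,IT",
  "2,Dan,515.2,2013-09-23,Operations",
  "3,Michelle,611,2014-11-15,IT",
  "4,Ryan,729,2014-05-11,HR",
  "5,Gary,843.25,2015-03-27,Finance"
)

# Write the lines to a CSV file in a temporary directory.
csv_path <- file.path(tempdir(), "input.csv")
writeLines(csv_lines, csv_path)

# Confirm what was written.
print(readLines(csv_path))
```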
Reading a CSV File

The read.csv() function reads a CSV file available in your current working directory.

Analyzing the CSV File

By default the read.csv() function gives the output as a data frame. This can be easily checked, and we can also check the number of columns and rows.

Once we read data in a data frame, we can apply all the functions applicable to data frames as explained in the subsequent section. We can fetch rows meeting specific filter criteria, similar to a SQL where clause. Typical queries are:

Get the maximum salary
Get the details of the person with max salary
Get all the people working in the IT department
Get the persons in the IT department whose salary is greater than 600
Get the people who joined on or after 2014

Writing into a CSV File

R can create a csv file from an existing data frame. The write.csv() function is used to create the csv file. This file gets created in the working directory.
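The reading, filtering and writing steps above can be sketched as follows. The sample data, its column names and the file names are assumptions for illustration, and a temporary directory stands in for the working directory:

```r
# Create a hypothetical input.csv (values are made up).
csv_path <- file.path(tempdir(), "input.csv")
writeLines(c(
  "id,name,salary,start_date,dept",
  "1,Rick,623.3,2012-01-01,IT",
  "2,Dan,515.2,2013-09-23,Operations",
  "3,Michelle,611,2014-11-15,IT",
  "4,Ryan,729,2014-05-11,HR",
  "5,Gary,843.25,2015-03-27,Finance"
), csv_path)

# Read the file; the result is a data frame.
data <- read.csv(csv_path, stringsAsFactors = FALSE)
print(is.data.frame(data))
print(ncol(data))
print(nrow(data))

# Maximum salary, and the row of the person who earns it.
max_salary <- max(data$salary)
top_earner <- subset(data, salary == max(salary))

# Filters similar to a SQL where clause.
it_people <- subset(data, dept == "IT")
it_over_600 <- subset(data, salary > 600 & dept == "IT")
joined_2014_on <- subset(data, as.Date(start_date) >= as.Date("2014-01-01"))
print(joined_2014_on)

# Write a result back out; row.names = FALSE avoids an extra index column.
out_path <- file.path(tempdir(), "output.csv")
write.csv(joined_2014_on, out_path, row.names = FALSE)
newdata <- read.csv(out_path, stringsAsFactors = FALSE)
print(newdata)
```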
Here the column X comes from the data set newper. This can be dropped using additional parameters, such as row.names = FALSE, while writing the file.

Microsoft Excel is the most widely used spreadsheet program, which stores data in the .xls or .xlsx format. R can read directly from these files using some Excel-specific packages. A few such packages are XLConnect, xlsx, gdata etc. We will be using the xlsx package. R can also write into an Excel file using this package.

Install xlsx Package

You can use the install.packages() command in the R console to install the "xlsx" package. It may ask to install some additional packages on which this package is dependent. Follow the same command with the required package name to install the additional packages.

Verify and Load the "xlsx" Package

Verify that the "xlsx" package is installed and load it with library() before using it.

Input as xlsx File

Open Microsoft Excel. Copy and paste the sample data into the worksheet named sheet1. Also copy and paste the second set of data to another worksheet and rename this worksheet to "city". Save the Excel file as "input.xlsx". You should save it in the current working directory of the R workspace.

Reading the Excel File

The input.xlsx is read by using the read.xlsx() function. The result is stored as a data frame in the R environment.

A binary file is a file that contains information stored only in the form of bits and bytes (0’s and 1’s).
They are not human readable, as the bytes in them translate to characters and symbols which contain many other non-printable characters. Attempting to read a binary file using any text editor will show characters like Ø and ð.

The binary file has to be read by specific programs to be usable. For example, the binary file of a Microsoft Word program can be read into a human readable form only by the Word program. This indicates that, besides the human readable text, there is a lot more information, like formatting of characters and page numbers, which is also stored along with the alphanumeric characters. And finally, a binary file is a continuous sequence of bytes. The line break we see in a text file is a character joining the first line to the next.

Sometimes, the data generated by other programs needs to be processed by R as a binary file. R is also required to create binary files which can be shared with other programs.

R has two functions, writeBin() and readBin(), to create and read binary files.

Syntax

The basic forms are writeBin(object, con) and readBin(con, what, n). Following is the description of the parameters used: con is the connection object to read or write the binary file, object is the R object to be written, what is the mode (such as character or integer) of the values to be read, and n is the maximum number of values to read from the binary file.
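A minimal round trip through writeBin() and readBin(), assuming a small integer vector and a file in a temporary directory (both made up for illustration):

```r
# Hypothetical values and file location.
values <- c(5L, 10L, 15L, 18L, 23L)
bin_path <- file.path(tempdir(), "values.bin")

# Open a binary write connection, write the raw bytes, then close it.
write.con <- file(bin_path, "wb")
writeBin(values, write.con)
close(write.con)

# Open a binary read connection and read the same number of integers back.
read.con <- file(bin_path, "rb")
restored <- readBin(read.con, what = integer(), n = length(values))
close(read.con)

print(restored)
```

The reader has to know the type and the count of the stored values, since the file itself is only a sequence of bytes.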
Example

We consider the R inbuilt data "mtcars". First we create a csv file from it, convert it to a binary file and store it as an OS file. Next we read this binary file back into R.

Writing the Binary File

We read the data frame "mtcars" as a csv file and then write it as a binary file to the OS.

Reading the Binary File

The binary file created above stores all the data as continuous bytes. So we will read it by choosing appropriate values of column names as well as the column values. Reading the binary file this way gives back the original data in R.

XML is a file format which shares both the file format and the data on the World Wide Web, intranets, and elsewhere using standard ASCII text. It stands for Extensible Markup Language (XML). Similar to HTML, it contains markup tags. But unlike HTML, where the markup tags describe the structure of the page, in XML the markup tags describe the meaning of the data contained in the file.

You can read an XML file in R using the "XML" package. This package can be installed using the install.packages() command.

Input Data

Create an XML file by copying the data into a text editor like Notepad. Save the file with a .xml extension, choosing the file type as all files (*.*).

Reading XML File

The XML file is read by R using the function xmlParse(). It is stored as a list in R.

Get Number of Nodes Present in XML File

The number of nodes is obtained by applying xmlSize() to the root node returned by xmlRoot().

Details of the First Node

Let's look at the first record of the parsed file.
It will give us an idea of the various elements present in the top level node.

Get Different Elements of a Node

Individual elements of a node can be read by indexing into the root node.

XML to Data Frame

To handle the data effectively in large files, we read the data in the XML file as a data frame. Then we process the data frame for data analysis. As the data is now available as a data frame, we can use data frame related functions to read and manipulate it.

A JSON file stores data as text in a human-readable format. JSON stands for JavaScript Object Notation. R can read JSON files using the rjson package.

Install rjson Package

In the R console, you can issue the install.packages() command to install the rjson package.

Input Data

Create a JSON file by copying the data into a text editor like Notepad. Save the file with a .json extension, choosing the file type as all files (*.*).

Read the JSON File

The JSON file is read by R using the fromJSON() function. It is stored as a list in R.

Convert JSON to a Data Frame

We can convert the extracted data above to an R data frame for further analysis using the as.data.frame() function.

Many websites provide data for consumption by their users. For example, the World Health Organization (WHO) provides reports on health and medical information in the form of CSV, txt and XML files.
Using R programs, we can programmatically extract specific data from such websites. Some packages in R which are used to scrape data from the web are "RCurl", "XML", and "stringr". They are used to connect to the URLs, identify the required links for the files and download them to the local environment.

Install R Packages

These packages are required for processing the URLs and links to the files. If they are not available in your R environment, you can install them using the install.packages() command.

Input Data

We will visit the URL weather data and download the CSV files using R for the year 2015.

Example

We will use the function getHTMLLinks() to gather the URLs of the files. Then we will use the function download.file() to save the files to the local system. As we will be applying the same code again and again for multiple files, we will create a function to be called multiple times. The filenames are passed as parameters in the form of an R list object to this function.

Verify the File Download

After running the code, you can locate the downloaded files in the current R working directory.

The data in relational database systems is stored in a normalized format. So, to carry out statistical computing we will need very advanced and complex Sql queries. But R can connect easily to many relational databases like MySql, Oracle, Sql server etc. and fetch records from them as a data frame. Once the data is available in the R environment, it becomes a normal R data set and can be manipulated or analyzed using all the powerful packages and functions.

In this tutorial we will be using MySql as our reference database for connecting to R.

RMySQL Package

R has a package named "RMySQL" which provides native connectivity with the MySql database. You can install this package in the R environment using the install.packages() command.
Connecting R to MySql

Once the package is installed, we create a connection object in R to connect to the database. It takes the username, password, database name and host name as input.

Querying the Tables

We can query the database tables in MySql using the function dbSendQuery(). The query gets executed in MySql and the result set is returned using the R fetch() function. Finally it is stored as a data frame in R.

Query with Filter Clause

We can pass any valid select query to get the result.

Updating Rows in the Tables

We can update the rows in a MySql table by passing the update query to the dbSendQuery() function. After executing it we can see the table updated in the MySql environment.

Inserting Data into the Tables

An insert statement can be passed to dbSendQuery() in the same way. After executing it we can see the row inserted into the table in the MySql environment.

Creating Tables in MySql

We can create tables in MySql using the function dbWriteTable(). It takes a data frame as input and, when called with overwrite = TRUE, overwrites the table if it already exists. After executing it we can see the table created in the MySql environment.

Dropping Tables in MySql

We can drop the tables in a MySql database by passing the drop table statement into dbSendQuery(), in the same way we used it for querying data from tables. After executing it we can see the table is dropped in the MySql environment.

R Programming language has numerous libraries to create charts and graphs.
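A pie-chart is a representation of values as slices of a circle with different colors. The slices are labeled and the numbers corresponding to each slice are also represented in the chart.

In R the pie chart is created using the pie() function, which takes positive numbers as a vector input. The additional parameters are used to control labels, color, title etc.

Syntax

The basic syntax for creating a pie-chart using R is pie(x, labels, radius, main, col, clockwise). Following is the description of the parameters used: x is a vector containing the numeric values used in the pie chart, labels gives the description of the slices, radius indicates the radius of the circle of the pie chart (a value between -1 and +1), main is the title of the chart, col indicates the color palette, and clockwise is a logical value indicating whether the slices are drawn clockwise or anti-clockwise.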
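A minimal sketch of a pie() call. The values, city labels and file name are made up for illustration, and the chart is written with the pdf() device to a temporary directory, which is an assumption made so that the snippet runs without a display; png() is used the same way where it is available:

```r
# Hypothetical values and labels.
x <- c(21, 62, 10, 53)
labels <- c("London", "New York", "Singapore", "Mumbai")

# Open a file-based graphics device.
out_file <- file.path(tempdir(), "city_pie.pdf")
pdf(file = out_file)

# One colour per slice, taken from the rainbow palette.
pie(x, labels = labels, main = "City pie chart", col = rainbow(length(x)))

# Close the device so the file is written.
dev.off()
```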
Example
A very simple pie chart is created using just the input vector and labels. The chart can be saved as an image file in the current R working directory.

Pie Chart Title and Colors
We can expand the features of the chart by adding more parameters to the function. The parameter main adds a title to the chart, and the parameter col sets the slice colors, here taken from the rainbow color palette. The length of the palette should be the same as the number of values in the chart, hence we use length(x).

Slice Percentages and Chart Legend
We can add slice percentages and a chart legend by creating additional chart variables.

3D Pie Chart
A pie chart with three dimensions can be drawn using additional packages. The package plotrix has a function called pie3D() that is used for this.

A bar chart represents data as rectangular bars, with the length of each bar proportional to the value of the variable. R uses the function barplot() to create bar charts. R can draw both vertical and horizontal bars, and each bar can be given a different color.
Example
A simple bar chart is created using just the input vector and the name of each bar. The chart can be saved as an image file in the current R working directory.

Bar Chart Labels, Title and Colors
The features of the bar chart can be expanded by adding more parameters. The main parameter adds a title and the col parameter adds colors to the bars. The names.arg parameter is a vector with the same number of values as the input vector; it gives the label of each bar.

Group Bar Chart and Stacked Bar Chart
We can create a bar chart with groups of bars, or with stacks in each bar, by using a matrix as the input values. More than two variables are represented as a matrix, which is used to create the group bar chart and the stacked bar chart.

A boxplot summarizes how the data in a data set is distributed. It is built on the three quartiles, which divide the data into four equal parts, and it shows the minimum, first quartile, median, third quartile and maximum of the data set. It is also useful for comparing the distribution of data across data sets by drawing a boxplot for each of them. Boxplots are created in R by using the boxplot() function.
Example
We use the data set "mtcars" available in the R environment to create a basic boxplot, looking at the columns "mpg" and "cyl".

Creating the Boxplot
The boxplot shows the relation between mpg (miles per gallon) and cyl (number of cylinders).

Boxplot with Notch
We can draw a boxplot with notches to find out how the medians of different data groups compare with each other. A notch is drawn for each data group.

A histogram represents the frequencies of the values of a variable bucketed into ranges. A histogram is similar to a bar chart, but it groups the values into continuous ranges. The height of each bar represents the number of values that fall in that range. R creates histograms using the hist() function, which takes a vector as input and uses further parameters to control the plot.
Example
A simple histogram is created using the input vector and the label, col and border parameters. The histogram can be saved as an image file in the current R working directory.

Range of X and Y values
To specify the range of values allowed on the X axis and Y axis, we can use the xlim and ylim parameters. The width of each bar can be controlled by using breaks.

A line chart is a graph that connects a series of points by drawing line segments between them. The points are ordered by one of their coordinates (usually the x-coordinate). Line charts are usually used to identify trends in data. The plot() function in R is used to create the line graph.
Example
A simple line chart is created using the input vector and the type parameter set to "o" (lower case), which draws both the points and the lines. The chart can be saved as an image file in the current R working directory.

Line Chart Title, Color and Labels
The features of the line chart can be expanded by using additional parameters. We add color to the points and lines, give a title to the chart and add labels to the axes.

Multiple Lines in a Line Chart
More than one line can be drawn on the same chart by using the lines() function. After the first line is plotted, the lines() function takes an additional vector as input to draw the second line in the chart.

Scatterplots show many points plotted in the Cartesian plane. Each point represents the values of two variables: one variable is plotted on the horizontal axis and the other on the vertical axis. A simple scatterplot is created using the plot() function.
Example
We use the data set "mtcars" available in the R environment to create a basic scatterplot, taking the columns "wt" and "mpg".

Creating the Scatterplot
The scatterplot shows the relation between wt (weight) and mpg (miles per gallon).

Scatterplot Matrices
When we have more than two variables and want to see how each variable relates to the others, we use a scatterplot matrix. The pairs() function creates matrices of scatterplots.
Example
Each variable is paired with each of the remaining variables, and a scatterplot is plotted for each pair.

Statistical analysis in R is performed by using many in-built functions. Most of these functions are part of the R base package. They take an R vector as input, along with other arguments, and return the result. The functions discussed in this chapter are mean, median and mode.

Mean
The mean is calculated by taking the sum of the values and dividing by the number of values in a data series. The function mean() is used to calculate this in R.
Applying Trim Option
When the trim parameter is supplied, the values in the vector are sorted and the given fraction of observations is dropped from each end before the mean is calculated. With trim = 0.3 and a vector of 10 values, 3 values are dropped from each end. In this case the sorted vector is (−21, −5, 2, 3, 4.2, 7, 8, 12, 18, 54), and the values removed before calculating the mean are (−21, −5, 2) from the left and (12, 18, 54) from the right.

Applying NA Option
If there are missing values, the mean function returns NA. To drop the missing values from the calculation, use na.rm = TRUE, which means remove the NA values.

Median
The middle value of a sorted data series is called the median. The median() function is used in R to calculate this value.
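The mean, trimmed mean, NA handling and median described above can be tried on the ten values listed in the trim discussion. This is a small sketch; the order of the values in the vector is an assumption, since only the sorted form is given in the text.

```r
# The ten values from the trim discussion (input order is assumed).
x <- c(12, 7, 3, 4.2, 18, 2, 54, -21, 8, -5)

plain_mean   <- mean(x)              # sum of the values / number of values
trimmed_mean <- mean(x, trim = 0.3)  # drops 3 values from each end first
middle       <- median(x)            # average of the two middle values

# A missing value makes mean() return NA unless na.rm = TRUE is given.
y <- c(x, NA)
mean_with_na    <- mean(y)
mean_without_na <- mean(y, na.rm = TRUE)

print(c(plain_mean, trimmed_mean, middle, mean_without_na))
```

With trim = 0.3 only (3, 4.2, 7, 8) remain, so the trimmed mean is much less affected by the extreme values −21 and 54 than the plain mean.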
Mode
The mode is the value that has the highest number of occurrences in a set of data. Unlike the mean and median, the mode can be found for both numeric and character data. R does not have a standard in-built function to calculate the mode, so we create a user function for it. This function takes the vector as input and gives the mode value as output.

Regression analysis is a very widely used statistical tool to establish a relationship model between two variables. One of these variables is called the predictor variable, whose value is gathered through experiments. The other is called the response variable, whose value is derived from the predictor variable. In linear regression these two variables are related through an equation in which the exponent (power) of both variables is 1. Mathematically, a linear relationship represents a straight line when plotted as a graph. A non-linear relationship, where the exponent of any variable is not equal to 1, creates a curve.

The general mathematical equation for a linear regression is y = ax + b, where y is the response variable, x is the predictor variable, and a and b are constants called the coefficients.
Steps to Establish a Regression
A simple example of regression is predicting the weight of a person when the height is known. To do this we need the relationship between the height and weight of a person. The steps to establish the relationship are set out in the sections below.
Input Data
The sample data consists of paired observations of height and weight.

lm() Function
This function creates the relationship model between the predictor and the response variable.
Create Relationship Model and Get the Coefficients
Fitting the model with lm() gives the intercept and the coefficient of the predictor variable.

Get the Summary of the Relationship
Passing the fitted model to summary() reports the coefficients together with their standard errors and p-values.

predict() Function
The predict() function takes the fitted model and a data frame of new predictor values, and returns the predicted values of the response variable.
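The lm() and predict() steps can be sketched as follows. Because the sample height and weight observations are not reproduced here, this example substitutes R's built-in women data set (heights in inches and weights in pounds of 15 women); the new heights of 60 and 70 inches are hypothetical.

```r
# Assumption: the built-in 'women' data set stands in for the sample data.
# Fit weight as a linear function of height.
model <- lm(weight ~ height, data = women)

# Intercept (b) and slope (a) of the fitted line y = ax + b.
coefs <- coef(model)
print(coefs)

# Summary with standard errors, p-values and R-squared.
print(summary(model))

# Predict the weight for two new (hypothetical) heights.
new_people <- data.frame(height = c(60, 70))
predicted <- predict(model, newdata = new_people)
print(predicted)
```

The column name in the new data frame must match the predictor name used in the formula, otherwise predict() cannot find the values it needs.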
Predict the Weight of New Persons
The fitted model is applied to new height values to predict the corresponding weights.

Visualize the Regression Graphically
Plotting the observations together with the fitted line shows the regression graphically.

Multiple regression is an extension of linear regression to the relationship between more than two variables. In simple linear regression we have one predictor and one response variable, but in multiple regression we have more than one predictor variable and one response variable.

The general mathematical equation for multiple regression is y = a + b1x1 + b2x2 + ... + bnxn, where y is the response variable, a, b1, b2, ..., bn are the coefficients, and x1, x2, ..., xn are the predictor variables.
We create the regression model using the lm() function in R. The model determines the values of the coefficients from the input data. We can then predict the value of the response variable for a given set of predictor variables using these coefficients.

lm() Function
This function creates the relationship model between the predictors and the response variable.
Example

Input Data
Consider the data set "mtcars" available in the R environment. It compares different car models in terms of mileage per gallon ("mpg"), engine displacement ("disp"), horse power ("hp"), weight of the car ("wt") and some more parameters. The goal of the model is to establish the relationship between "mpg" as the response variable and "disp", "hp" and "wt" as predictor variables. We create a subset of these variables from the mtcars data set for this purpose.

Create Relationship Model and Get the Coefficients
Fitting the model gives the intercept and one coefficient for each of the three predictors.

Create Equation for Regression Model
Based on the intercept and coefficient values, we create the mathematical equation.

Apply Equation for Predicting New Values
We can use the regression equation to predict the mileage when a new set of values for displacement, horse power and weight is provided, for example for a car with disp = 221, hp = 102 and wt = 2.91.

Logistic regression is a regression model in which the response variable (dependent variable) has categorical values such as True/False or 0/1. It models the probability of a binary response through a mathematical equation relating it to the predictor variables.

The general mathematical equation for logistic regression is y = 1/(1 + e^-(a + b1x1 + b2x2 + b3x3 + ...)), where y is the response variable, x1, x2, ... are the predictor variables, and a and b1, b2, ... are the coefficients.
The function used to create the logistic regression model is the glm() function, called with the binomial family.
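A minimal sketch of a logistic regression with glm() is shown here. It assumes the built-in mtcars data set, with the binary column am as the response and cyl, hp and wt as predictors; the variable names input, model and prob are only illustrative.

```r
# Keep only the response (am) and the three predictors.
input <- mtcars[, c("am", "cyl", "hp", "wt")]

# family = binomial requests a logistic regression (logit link).
model <- glm(am ~ cyl + hp + wt, data = input, family = binomial)

# Coefficient table: estimate, standard error, z value and p-value.
print(coef(summary(model)))

# Fitted probabilities that each car has am = 1.
prob <- predict(model, type = "response")
print(head(round(prob, 3)))
```

The last column of the coefficient table holds the p-values that are used to judge which predictors matter.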
Example
The in-built data set "mtcars" describes different models of car with their various engine specifications. In the "mtcars" data set, the transmission mode (automatic or manual) is described by the column am, which is a binary value (0 or 1). We can create a logistic regression model between the column "am" and three other columns: hp, wt and cyl.

Create Regression Model
We use the glm() function to create the regression model and get its summary for analysis.

Conclusion
In the summary, the p-value in the last column is more than 0.05 for the variables "cyl" and "hp", so we consider them insignificant in contributing to the value of the variable "am". Only weight (wt) impacts the "am" value in this regression model.

In a random collection of data from independent sources, it is often observed that the distribution of the data is approximately normal. This means that on plotting a graph with the value of the variable on the horizontal axis and the count of the values on the vertical axis, we get a bell-shaped curve. The center of the curve represents the mean of the data set. In the graph, fifty percent of the values lie to the left of the mean and the other fifty percent lie to the right of the mean. This is referred to as the normal distribution in statistics.

R has four in-built functions for working with the normal distribution: dnorm(), pnorm(), qnorm() and rnorm(). They are described below.
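R's four normal-distribution functions, dnorm(), pnorm(), qnorm() and rnorm(), can be tried with a short sketch such as the one below. The mean of 50, standard deviation of 5 and sample size of 1000 used for rnorm() are arbitrary illustrative choices.

```r
# Density of the standard normal at its mean: 1 / sqrt(2 * pi).
d0 <- dnorm(0, mean = 0, sd = 1)

# Cumulative probability P(X < 1.96) for a standard normal variable.
p <- pnorm(1.96)

# Quantile: the value below which 97.5% of the distribution lies.
q <- qnorm(0.975)

# 1000 random draws; the seed only makes the draws repeatable.
set.seed(42)
s <- rnorm(1000, mean = 50, sd = 5)

print(c(density = d0, cumulative = p, quantile = q, sample_mean = mean(s)))
```

pnorm() and qnorm() are inverses of each other, which is why 1.96 and 0.975 appear as a pair here.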
dnorm()
This function gives the height of the probability density at each point for a given mean and standard deviation.

pnorm()
This function gives the probability that a normally distributed random number is less than a given value. It is also called the "Cumulative Distribution Function".

qnorm()
This function takes a probability value and gives the number whose cumulative probability matches it.

rnorm()
This function is used to generate random numbers whose distribution is normal. It takes the sample size as input and generates that many random numbers. A histogram of the generated numbers shows their distribution.

The binomial distribution model deals with finding the probability of a given number of successes in a series of experiments, where each experiment has only two possible outcomes. For example, a coin toss always gives a head or a tail. The probability of getting exactly 3 heads when tossing a coin 10 times is given by the binomial distribution.

R has four in-built functions for working with the binomial distribution: dbinom(), pbinom(), qbinom() and rbinom(). They are described below.
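The coin example can be worked through with R's four binomial functions. This is a small sketch assuming a fair coin (prob = 0.5) tossed 10 times; the number of random draws (8) is arbitrary.

```r
# Probability of exactly 3 heads in 10 tosses: choose(10, 3) / 2^10.
p_exact3 <- dbinom(3, size = 10, prob = 0.5)

# Probability of 3 heads or fewer in 10 tosses.
p_atmost3 <- pbinom(3, size = 10, prob = 0.5)

# Smallest number of heads whose cumulative probability reaches 0.5.
q_half <- qbinom(0.5, size = 10, prob = 0.5)

# Eight simulated experiments, each counting heads in 10 tosses.
set.seed(1)
draws <- rbinom(8, size = 10, prob = 0.5)

print(c(exactly_3 = p_exact3, at_most_3 = p_atmost3, median_heads = q_half))
print(draws)
```

dbinom() answers "exactly this many" questions, while pbinom() answers "this many or fewer" questions.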
dbinom()
This function gives the probability of each exact number of successes (the probability mass at each point).

pbinom()
This function gives the cumulative probability, that is, the probability of getting at most a given number of successes. It is a single value representing the probability.

qbinom()
This function takes a probability value and gives the number of successes whose cumulative probability matches it.

rbinom()
This function generates the required number of random values, each being the number of successes in a given number of trials with a given probability of success.

Poisson regression involves regression models in which the response variable is in the form of counts and not fractional numbers, for example the number of births or the number of wins in a football match series. The values of the response variable are also assumed to follow a Poisson distribution.

The general mathematical equation for Poisson regression is log(y) = a + b1x1 + b2x2 + ... + bnxn, where y is the response variable, a and b1, b2, ..., bn are the coefficients, and x1, x2, ..., xn are the predictor variables.
The function used to create the Poisson regression model is the glm() function, called with the poisson family.
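A minimal sketch of a Poisson regression with glm() follows. It assumes the built-in warpbreaks data set, with the count column breaks as the response and wool and tension as predictors; the object names are illustrative only.

```r
# family = poisson requests a Poisson regression with a log link.
model <- glm(breaks ~ wool + tension, data = warpbreaks, family = poisson)

# Coefficients are on the log scale.
coefs <- coef(model)
print(round(coefs, 3))

# Exponentiating gives multiplicative effects on the expected count.
print(round(exp(coefs), 3))

# Coefficient table with p-values in the last column.
print(coef(summary(model)))
```

Because of the log link, a negative coefficient means a lower expected number of breaks relative to the reference level (wool A, low tension).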
Example
We have the in-built data set "warpbreaks", which describes the effect of wool type (A or B) and tension (low, medium or high) on the number of warp breaks per loom. We take "breaks" as the response variable, which is a count of the number of breaks. The wool "type" and "tension" are taken as predictor variables.

Create Regression Model
In the summary we look for a p-value in the last column of less than 0.05 to conclude that a predictor variable has an impact on the response variable. As seen, wool type B and tension types M and H have an impact on the count of breaks.

We use regression analysis to create models which describe the effect of variation in predictor variables on the response variable. Sometimes we also have a categorical variable with values like Yes/No or Male/Female. A simple regression analysis then gives a separate result for each value of the categorical variable. In such a scenario, we can study the effect of the categorical variable by using it along with the predictor variable and comparing the regression lines for each level of the categorical variable. Such an analysis is termed Analysis of Covariance, also called ANCOVA.

Example
Consider the R built-in data set mtcars. In it we observe that the field "am" represents the type of transmission (auto or manual). It is a categorical variable with values 0 and 1. The miles per gallon value ("mpg") of a car can depend on it as well as on the value of horse power ("hp"). We study the effect of the value of "am" on the regression between "mpg" and "hp". This is done by using the aov() function followed by the anova() function to compare the multiple regressions.

Input Data
Create a data frame containing the fields "mpg", "hp" and "am" from the data set mtcars.
Here we take "mpg" as the response variable, "hp" as the predictor variable and "am" as the categorical variable.

ANCOVA Analysis
We create a regression model taking "hp" as the predictor variable and "mpg" as the response variable, taking into account the interaction between "am" and "hp".

Model with interaction between categorical variable and predictor variable
This result shows that both horse power and transmission type have a significant effect on miles per gallon, as the p-value in both cases is less than 0.05. But the interaction between these two variables is not significant, as its p-value is more than 0.05.

Model without interaction between categorical variable and predictor variable
This result shows that both horse power and transmission type have a significant effect on miles per gallon, as the p-value in both cases is less than 0.05.

Comparing Two Models
Now we can compare the two models to conclude whether the interaction of the variables is truly insignificant. For this we use the anova() function. As the p-value is greater than 0.05, we conclude that the interaction between horse power and transmission type is not significant. So the mileage per gallon depends on the horse power of the car in a similar manner in both auto and manual transmission mode.

A time series is a series of data points in which each data point is associated with a timestamp. A simple example is the price of a stock in the stock market at different points of time on a given day. Another example is the amount of rainfall in a region in different months of the year. The R language uses many functions to create, manipulate and plot time series data.
The data for the time series is stored in an R object called a time-series object. It is an R data object like a vector or data frame. The time-series object is created by using the ts() function, which takes the data together with the parameters start, end and frequency.
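As a minimal sketch of ts(), the example below builds a monthly series starting in January 2012. The rainfall figures are hypothetical values invented for illustration, and the name rain_ts is arbitrary.

```r
# Hypothetical monthly rainfall figures (mm) for one year.
rainfall <- c(80, 117, 87, 133, 64, 92, 69, 100, 78, 99, 88, 107)

# frequency = 12 marks the data as monthly; start = c(2012, 1) is Jan 2012.
rain_ts <- ts(rainfall, start = c(2012, 1), frequency = 12)
print(rain_ts)

# window() extracts a sub-period, here April to June 2012.
second_quarter <- window(rain_ts, start = c(2012, 4), end = c(2012, 6))
print(second_quarter)
```

Printing a monthly ts object lays the values out under month names, which is a quick way to confirm that start and frequency were set as intended.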
Except for the parameter "data", all other parameters are optional.

Example
Consider the monthly rainfall details at a place starting from January 2012. We create an R time series object for a period of 12 months and plot it.

Different Time Intervals
The value of the frequency parameter in the ts() function decides the time intervals at which the data points are measured. A value of 12 indicates that the time series is monthly, and a value of 4 indicates that it is quarterly.
Multiple Time Series
We can plot multiple time series in one chart by combining the series into a matrix.

When modeling real world data for regression analysis, we observe that it is rarely the case that the equation of the model is a linear equation giving a linear graph. Most of the time, the equation of the model of real world data involves mathematical functions of higher degree, like an exponent of 3 or a sin function. In such a scenario, the plot of the model gives a curve rather than a line. The goal of both linear and non-linear regression is to adjust the values of the model's parameters to find the line or curve that comes closest to the data. On finding these values we are able to estimate the response variable with good accuracy.

In least squares regression, we establish a regression model in which the sum of the squares of the vertical distances of the points from the regression curve is minimized. We generally start with a defined model and assume some values for the coefficients. We then apply the nls() function of R to get more accurate values along with the confidence intervals.
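The nls() workflow described above can be sketched as follows. The model form y = b1 * x^2 + b2, the true coefficients (1.2 and 2), the noise level and the starting values are all assumptions made for this illustration. Noise is added deliberately because nls() is not meant for artificial zero-residual data. The intervals shown are simple Wald intervals computed from the standard errors.

```r
# Simulated data from an assumed model y = 1.2 * x^2 + 2 plus small noise.
set.seed(7)
x <- seq(1, 10, length.out = 20)
y <- 1.2 * x^2 + 2 + rnorm(length(x), sd = 0.3)

# Fit the model, starting from the assumed initial values b1 = 1, b2 = 3.
fit <- nls(y ~ b1 * x^2 + b2, start = list(b1 = 1, b2 = 3))

# Estimated coefficients and their standard errors.
est <- coef(fit)
se  <- sqrt(diag(vcov(fit)))

# Approximate 95% (Wald) confidence intervals.
t_crit <- qt(0.975, df.residual(fit))
ci <- cbind(lower = est - t_crit * se, upper = est + t_crit * se)

print(est)
print(ci)
```

Comparing the assumed starting values with the fitted estimates and their intervals shows how far the initial guesses were from the values supported by the data.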
Example
We will consider a nonlinear model with assumed initial values for its coefficients. Next we will look at the confidence intervals of the fitted values so that we can judge how well the assumed values fit into the model. For this purpose we consider a model equation with two coefficients, b1 and b2, and assume the initial coefficients to be 1 and 3 when fitting the model with the nls() function. From the result we can conclude that the value of b1 is close to 1, while the value of b2 is closer to 2 than to 3.

A decision tree is a graph that represents choices and their results in the form of a tree. The nodes in the graph represent an event or choice, and the edges of the graph represent the decision rules or conditions. It is mostly used in Machine Learning and Data Mining applications using R.

Examples of the use of decision trees are predicting whether an email is spam or not spam, predicting whether a tumor is cancerous, or predicting whether a loan is a good or bad credit risk based on the factors in each of these. Generally, a model is created with observed data, also called training data. Then a set of validation data is used to verify and improve the model. For a new set of predictor variables, we use this model to arrive at a decision on the category (yes/no, spam/not spam) of the data. R has packages which are used to create and visualize decision trees.

The R package "party" is used to create decision trees.

Install R Package
Use install.packages() in the R console to install the package. You also have to install the dependent packages, if any.

The package "party" has the function ctree(), which is used to create and analyze decision trees.
Input Data
We will use the data set named readingSkills to create a decision tree. It records a person's reading score together with the variables "age" and "shoesize" and whether the person is a native speaker or not.

Example
We will use the ctree() function to create the decision tree and see its graph.

Conclusion
From the decision tree we can conclude that anyone whose readingSkills score is less than 38.3 and whose age is more than 6 is not a native speaker.

In the random forest approach, a large number of decision trees are created. Every observation is fed into every decision tree, and the most common outcome for each observation is used as the final output. A new observation is fed into all the trees, and a majority vote is taken across the trees to classify it.

An error estimate is made for the cases which were not used while building each tree. This is called the OOB (out-of-bag) error estimate and is given as a percentage.

The R package "randomForest" is used to create random forests.

Install R Package
Use install.packages() in the R console to install the package. You also have to install the dependent packages, if any.

The package "randomForest" has the function randomForest(), which is used to create and analyze random forests.
Input Data
We will use the data set named readingSkills to create a random forest. It records a person's reading score together with the variables "age" and "shoesize" and whether the person is a native speaker.

Example
We will use the randomForest() function to create the random forest and inspect its output.

Conclusion
From the random forest we can conclude that shoesize and score are the important factors in deciding whether someone is a native speaker or not. The model also has an estimated error of only 1%, which suggests we can predict with about 99% accuracy.

Survival analysis deals with predicting the time when a specific event is going to occur. It is also known as failure time analysis or analysis of time to death. Examples are predicting the number of days a person with cancer will survive, or predicting the time when a mechanical system is going to fail.

The R package named survival is used to carry out survival analysis. This package contains the function Surv(), which creates a survival object from the chosen variables for use in an R formula. We then use the function survfit() to fit the survival curve, which can be plotted for the analysis.
Example
We will consider the data set named "pbc" present in the survival package installed above. It describes the survival data points of people affected with primary biliary cirrhosis (PBC) of the liver. Among the many columns present in the data set, we are primarily concerned with the fields "time" and "status". Time represents the number of days between registration of the patient and the earlier of two events: the patient receiving a liver transplant or the death of the patient.

Applying Surv() and survfit() Function
Now we apply the Surv() function to the above data set and create a plot that shows the trend. The trend in the resulting graph helps us predict the probability of survival at the end of a certain number of days.

The chi-square test is a statistical method to determine whether two categorical variables have a significant association between them. Both variables should be from the same population, and they should be categorical, like Yes/No, Male/Female or Red/Green.

For example, we can build a data set with observations on people's ice-cream buying patterns and try to relate the gender of a person to the flavor of ice-cream they prefer. If an association is found, we can plan for an appropriate stock of flavors by knowing the number of people of each gender visiting.

Syntax
The function used for performing the chi-square test is chisq.test(). It takes the data in the form of a table containing the count values of the variables in the observation.
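A minimal sketch of chisq.test() on the ice-cream example might look like this. The counts in the table are hypothetical and chosen only for illustration; the row and column names are likewise invented.

```r
# Hypothetical counts of preferred flavor by gender (100 customers).
counts <- matrix(c(30, 20, 15, 35), nrow = 2,
                 dimnames = list(gender = c("F", "M"),
                                 flavor = c("vanilla", "chocolate")))
print(counts)

# Chi-square test of independence between gender and flavor.
# For a 2 x 2 table R applies Yates' continuity correction by default.
result <- chisq.test(counts)

print(result$statistic)  # test statistic
print(result$parameter)  # degrees of freedom
print(result$p.value)    # p-value
print(result$expected)   # expected counts under independence
```

A p-value below 0.05 is read as evidence that the two categorical variables are associated, and the expected counts are worth checking because the approximation is unreliable when they are small.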
Example
We will take the Cars93 data in the "MASS" library, which describes different car models on sale in the year 1993.
[code listing missing in source]
When we execute the above code, it produces the following result:
[output missing in source]
The above result shows that the data set has many Factor variables, which can be treated as categorical variables. For our model we will consider the variables "AirBags" and "Type". Here we aim to find out whether there is a significant association between the type of car sold and the type of air bags it has. If an association is observed, we can estimate which types of cars sell better with which types of air bags.

What is the main difference between binomial and Poisson distributions?
Binomial distribution: each trial has only two possible outcomes, success or failure, and the number of trials n is fixed, so the number of successes lies between 0 and n. Example of binomial distribution: coin tosses. Poisson distribution: the count of events has no upper limit, because events occur at an average rate rather than in a fixed number of trials.
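The contrast can be made concrete with a minimal Python sketch. It reuses two of the worked examples from earlier on this page: defective components with p = 0.1 and n = 20 (binomial), and a typist averaging 2 errors per page (Poisson). The function names `binomial_pmf` and `poisson_pmf` are illustrative choices, not part of any library.

```python
from math import comb, exp, factorial


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for k successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for k events when lam events are expected on average."""
    return lam**k / factorial(k) * exp(-lam)


# Binomial: a definite probability p = 0.1 of a defective component, n = 20 components.
p_two_defective = binomial_pmf(2, n=20, p=0.1)

# Poisson: an average rate of 2 errors per page, with no fixed number of trials.
p_no_errors = poisson_pmf(0, lam=2.0)

print(round(p_two_defective, 3))  # 0.285
print(round(p_no_errors, 3))      # 0.135
```

The binomial function needs both n and p, while the Poisson function needs only the mean, which matches the two-parameter versus one-parameter distinction.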
What is the difference between binomial distribution and normal distribution?
The main difference between the binomial distribution and the normal distribution is that the binomial distribution is discrete, whereas the normal distribution is continuous. The binomial distribution takes a finite number of possible values (the counts 0, 1, ..., n), whereas the normal distribution can take any real value, so its probabilities are assigned to intervals rather than to single points.
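As a rough illustration of discrete versus continuous, the sketch below compares the binomial probability of exactly 50 successes with the normal probability of the interval from 49.5 to 50.5. The values n = 100 and p = 0.5 are hypothetical, chosen only for the example, and the normal curve is given the same mean and standard deviation as the binomial.

```python
from math import comb, erf, sqrt


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability mass at the single integer value k."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """P(X <= x) for a normal variable with mean mu and standard deviation sigma."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))


n, p = 100, 0.5                   # hypothetical values for illustration
mu = n * p                        # binomial mean
sigma = sqrt(n * p * (1 - p))     # binomial standard deviation

# Discrete: the single value 50 carries positive probability.
p_exactly_50 = binomial_pmf(50, n, p)

# Continuous: a single point has probability zero, so use the interval [49.5, 50.5].
p_interval = normal_cdf(50.5, mu, sigma) - normal_cdf(49.5, mu, sigma)
p_point = normal_cdf(50.0, mu, sigma) - normal_cdf(50.0, mu, sigma)

print(p_exactly_50, p_interval, p_point)
```

The first two numbers come out close to each other, while the normal probability of the single point 50 is zero.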
What is the difference between Poisson process and Poisson distribution?
The Poisson process is the model we use for describing randomly occurring events. By itself it does not give numerical answers. For those we need the Poisson distribution, which gives the probability of a number of events in a time period and, through the probability of zero events in that period, the probability of waiting longer than some time until the next event.
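A minimal sketch of both calculations, using the rate of 0.5 crashes per day from the worked example earlier on this page. The function names are illustrative, and the waiting-time probability is computed as the Poisson probability of zero events in the window.

```python
from math import exp, factorial


def prob_events(k: int, rate: float, duration: float) -> float:
    """P(exactly k events in a window), with Poisson mean rate * duration."""
    mean = rate * duration
    return mean**k / factorial(k) * exp(-mean)


def prob_wait_longer_than(t: float, rate: float) -> float:
    """P(no event occurs within time t), i.e. zero events in a window of length t."""
    return prob_events(0, rate, t)


rate = 0.5  # crashes per day, taken from the worked example

p_two_in_week = prob_events(2, rate, duration=7)   # exactly 2 crashes in 7 days
p_quiet_day = prob_wait_longer_than(1, rate)       # no crash during the next day

print(round(p_two_in_week, 3))  # 0.185
print(round(p_quiet_day, 3))    # 0.607
```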
What are the similarities between binomial distribution and Poisson distribution?
The binomial and Poisson distributions share the following similarities: both can be used to model the number of occurrences of some event, and in both the events are assumed to be independent.
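The two distributions also give nearly the same numbers when the number of trials is large and the success probability is small. The sketch below compares them for hypothetical values n = 1000 and p = 0.002, which are assumptions made for illustration and do not come from the examples on this page.

```python
from math import comb, exp, factorial

n, p = 1000, 0.002   # hypothetical: many trials, small success probability
lam = n * p          # matching Poisson mean

gaps = []
for k in range(6):
    binom = comb(n, k) * p**k * (1 - p) ** (n - k)   # binomial P(X = k)
    poisson = lam**k / factorial(k) * exp(-lam)      # Poisson P(X = k)
    gaps.append(abs(binom - poisson))
    print(k, round(binom, 4), round(poisson, 4))

max_gap = max(gaps)  # largest difference over k = 0..5
```

For these values the two columns agree to about three decimal places. With a larger p the agreement would be worse.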