What You'll Learn
> To create variables to easily store data values, or vectors/lists of values, or datasets, or objects in R.
> To be able to assign values or objects to variables, understand naming style conventions, and override variable values.
If you haven’t installed R and RStudio already, you can watch the "Getting started with Python and R for Data Science" video to get started.
For the dataset used in this exercise, download from here.
A variable lets us store a data value, a vector or list of values, a dataset, or an object in R under a name. We can then reference that name wherever we need it, which saves us from rewriting the data value, vector, or object many times throughout a program.
We’ll cover vectors, the main R objects, and reading in datasets later in the video series, but we first need to understand how to set up a variable so that we can write these things to a variable name. When we wrote our first “hello world” program, you might recall, we tied our data value to a variable name called “hello.string”. So, let’s revisit this.
We have a variable called “hello.string”, and we use the assignment operator (<-) to tie whatever we want to it, such as a data value of some sort. A variable name should be almost self-explanatory about what the variable stores or refers to. I wouldn’t call a variable “cat” and give it the value “dog”, for example. It’s best to give it a name that makes sense.
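As a minimal sketch of the assignment described above: the text names only the variable “hello.string”, so the string value used here is an assumed placeholder.

```r
# Tie a data value to a variable name with the assignment operator <-
# (the string value is an assumed placeholder, not taken from the video)
hello.string <- "Hello, world!"

# Refer to the variable by name instead of retyping the value
print(hello.string)
```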
Now, you might be wondering whether there are naming or style conventions for variables in R. In R, we usually separate the words in a name with a period, for example “animals.list”. We can also include numbers in our variable names. So, if you have multiple datasets, you might want to call your variable “dataset1”, or if you have multiple models, you might want to call it “model1”.
R is case-sensitive too. So, if you were to type “Model1” with a capital M, R would not find it, because we gave our variable the name “model1” all in lowercase. You can use other naming conventions as well, such as underscores (“animals_list”) or camel case (“animalsList”).
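The naming styles and the case-sensitivity point can be tried out with a short sketch like the one below. The stored values and the names “lower.found” and “upper.found” are illustrative assumptions; only “animals.list” and “model1” come from the text.

```r
# Period-separated, underscore, and camel case names are all accepted by R
animals.list <- c("cat", "dog")   # illustrative values
animals_list <- animals.list
animalsList  <- animals.list

# Names are case-sensitive: we define model1, but never Model1
model1 <- "first model"           # illustrative value

# exists() reports whether a variable with that exact name can be found
lower.found <- exists("model1")
upper.found <- exists("Model1")
print(c(lower.found, upper.found))
```

In a fresh session, only the lowercase name should be found, since nothing called “Model1” was ever created.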
A variable name can only consist of letters, numbers, periods, and underscores, and it must begin with a letter, or with a period that is not followed by a number. For example, you cannot have a variable called “2pairs”. This won’t work. The same goes for a period followed by a number, such as “.2pairs”. You can, however, spell the number out, as in “two.pairs”, and the same works after a leading period, as in “.two.pairs”. Just keep in mind that you also cannot use spaces when you’re naming your variables.
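One way to check these rules without triggering a syntax error is to compare each candidate name with the output of base R's make.names(), which rewrites a string into a syntactically valid name. This is a sketch rather than the method used in the video; the candidate list is based on the examples in the text, and “is.valid” is a name chosen here for illustration.

```r
# Candidate names based on the examples in the text
candidates <- c("two.pairs", ".two.pairs", "two_pairs",
                "2pairs", ".2pairs", "two pairs")

# make.names() returns a valid version of each string;
# a string that comes back unchanged was already a valid name
is.valid <- make.names(candidates) == candidates
names(is.valid) <- candidates
print(is.valid)
```

The first three candidates should come back as valid, while the names starting with a number, starting with a period followed by a number, or containing a space should not.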
Lastly, you can override what a variable stores by using the same variable name but tying it to something else. For example, take “animal <- 'cat'”. We’re going to override this now with a different animal, “dog”, by writing “animal <- 'dog'”. The variable “animal” once had the value “cat”, but we have updated it with the new value “dog” using the same variable name. Just note that once you override a variable, its old value is gone, so you would either have to assign “cat” to it again or store “cat” under another variable name if you would still like to use it.
Now that we’ve set up a variable, we can easily refer to the variable name throughout a program, so we don’t need to keep typing out our data values, vectors, or objects.
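The override can be sketched as follows. The variable “previous.animal” is a hypothetical extra name, added only to show one way of keeping the old value around.

```r
# Start with the value "cat"
animal <- "cat"
print(animal)

# Keep a copy under another name if the old value is still needed
# (previous.animal is a hypothetical name, not from the text)
previous.animal <- animal

# Assigning to the same name replaces the stored value
animal <- "dog"
print(animal)
print(previous.animal)
```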
In the next video, we’ll cover different operators.
Rebecca Merrett - Rebecca holds a bachelor’s degree of information and media from the University of Technology Sydney and a postgraduate diploma in mathematics and statistics from the University of Southern Queensland. She has a background in technical writing for games dev and has written for tech publications.
© Copyright – Data Science Dojo