It’s Murphy’s Law of Data: The data you have isn’t always in the format that you need. And not all problems have to do with mistakes or gaps in the data. Sometimes you’ve got wide data that needs to be long, or long data that needs to be wide.
Let’s work on an example. Here, I’ll read in a spreadsheet of home prices in five U.S. metro areas: Boston, Detroit, Philadelphia, San Francisco, and San Jose (which I’m calling Silicon Valley). More specifically, it holds a home price index for each metro area, recorded every two years from 2000 to 2018, with every city’s index set to 100 in 1995.
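Before getting to the real spreadsheet, here’s a minimal sketch of what “wide” and “long” mean, using base R’s `reshape()`. The data frame, the column names (`y2000`, `y2002`, `index`, `year`) and the index values are all invented placeholders for illustration; they are not figures from the spreadsheet, and this isn’t necessarily the approach used later in the article.

```r
# Toy wide-format data: one row per metro area, one column per year.
# All values are made-up placeholders, not real home price data.
wide <- data.frame(
  metro = c("Boston", "Detroit"),
  y2000 = c(110, 105),
  y2002 = c(120, 108),
  stringsAsFactors = FALSE
)

# Wide to long: stack the year columns into one row per metro per year.
long <- reshape(
  wide,
  direction = "long",
  varying   = c("y2000", "y2002"),  # columns to stack
  v.names   = "index",              # name of the new value column
  timevar   = "year",               # name of the new year column
  times     = c(2000, 2002),        # year value for each stacked column
  idvar     = "metro"
)
rownames(long) <- NULL

# Long back to wide: one index column per year again.
wide_again <- reshape(
  long,
  direction = "wide",
  idvar     = "metro",
  timevar   = "year",
  v.names   = "index"
)

print(long)
print(wide_again)
```

In the long version, each metro area appears once per year; in the wide version, each metro area appears once, with a separate column for every year.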
Here’s a look at the spreadsheet: