BLOG 5

Primary sources are great for collecting data because they contain so much of it, especially when you are looking for statistical data. A primary source generally has most, if not all, of the data you need to build a dataset that you can then tidy up using tidy data principles.

First we want to organize our columns. In the tidy dataset project I used a donation list from Lewis and Clark's cross-country expedition. The first column was made up of folio numbers, meaning the page on which the donation was recorded relative to the other donations in the group. The second column was the date the item was donated. The third column was the donor, the person who gave the donation recorded in the folio. The fourth column was the actual item that was donated. Finally, the last column was a description of the donation, detailing what it is and its subject matter. I chose the folio number as the first column because all the other categories follow from it as the subject being recorded. Each row's observation starts with that first column: the folio number is the subject, and the columns that follow it are information about that folio number. The values sit where a column and a row intersect.

The advantage of a tidy dataset is that it gives you a standardized way of organizing data, which makes it easier for people to read and understand your data. Wickham's principles are: each variable forms a column, each observation forms a row, and each type of observational unit forms a table. This makes the data easy to read when it's time to present and share the research. As long as the researcher knows which variables to include in the rows and columns of their primary dataset, the values should be relatively easy to read for any other researcher interested in looking into the data. Primary sources are the best to use for data collecting because they often have everything we need to make a data table about the subject we are looking at.
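To make the column layout described above a little more concrete, here is a minimal sketch in Python. It assumes the five columns from the donation list (folio, date, donor, item, description). The column names, the `is_tidy` and `to_csv` helpers, and the two rows are all made up for illustration; the rows are placeholders, not real entries from the source.

```python
import csv
import io

# Column names follow the five columns described in the post (names are my own).
COLUMNS = ["folio", "date", "donor", "item", "description"]

# Placeholder rows, invented purely for illustration (not real records).
rows = [
    {"folio": "1", "date": "date A", "donor": "Donor A",
     "item": "Item A", "description": "Description A"},
    {"folio": "2", "date": "date B", "donor": "Donor B",
     "item": "Item B", "description": "Description B"},
]


def is_tidy(rows, columns):
    """Return True if every row has exactly one non-empty value per column."""
    return all(
        set(row) == set(columns) and all(str(row[c]).strip() for c in columns)
        for row in rows
    )


def to_csv(rows, columns):
    """Write rows as CSV text: one header line, then one line per observation."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


print(is_tidy(rows, COLUMNS))
print(to_csv(rows, COLUMNS))
```

Each dictionary is one observation (one row), each key is one variable (one column), and the whole list is one table about one kind of thing, which is a simple way of checking a table against Wickham's three principles.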
A secondary source helps the primary source convey the point the dataset is focused on, but it won't include all the pieces that a primary source has. Incorporating a secondary source yields a longer data table to support your research, but relying on secondary sources exclusively wouldn't give a researcher enough data. Secondary sources are great for explaining your primary source, but on their own they can't supply enough data to build a useful dataset.
