A research team planned to study Australian road transport crash fatalities from 2010 to 2018…
ICT110 Introduction to Data Science Assignment 2

Assignment Task

A research team planned to study Australian road transport crash fatalities from 2010 to 2018 (inclusive). As a team member, you were given the dataset about Australian Road Death Fatalities (https://data.gov.au/data/dataset/australian-road-deaths-database), and were requested to analyse the data and prepare a report about your work and findings.

The dataset can be downloaded from Blackboard or the above website. The dataset contains basic demographic and crash details of Australian road crashes between 1989 and 2019. As the team does not have any specific goal for the analysis, you have the freedom to explore the data and dig out anything you find interesting or significant. However, you are to limit your research and analysis to the years 2010 to 2018.

The potential audiences include other researchers, business representatives, and government agencies. They may have limited ICT or mathematical knowledge.

To prepare the report, please include the following sections:

1. Introduction

Provide an introduction to the problem. Include background material as appropriate: who cares about this problem, what impact it has, where the data comes from, and what the dimensions and structures of the data are.

2. Data Setup

Describe how to load the data, and how the pre-processing is performed.

The original dataset is not ready for analysis and it is different from the data forms that we are familiar with from previous practices. This means we need to do some pre-processing, either for the whole dataset, or for a subset of the dataset required for each sub-task described later. Once you have some ideas for exploratory or advanced analysis, you need to adjust the form of the dataset. This can be achieved either by manipulating records in R by transposition or subsetting, or with other tools (e.g. Notepad or Excel) before reading them into R. Please explain your solution in this section.

3. Exploratory Data Analysis

3.1 One-variable analysis

One-variable analysis studies one variable (one row or one column) at a time. For example, we can select a particular Australian state or year to get a column of numbers, and a histogram can be used.

Perform 2 one-variable analyses. Plot one graph for each variable. Explain the finding for each graph.

3.2 Two-variable analysis

Two-variable analysis studies the relation between two variables. For example, we can select “Diseases of the nervous system” and “Year”, then a time series (scatter) plot can be drawn. Or, we can select “2015” and “Causes”.

Perform 2 two-variable analyses. Plot one graph for each variable. Explain the finding for each graph.

4. Advanced Analysis

4.1 Clustering

Briefly explain the concept of clustering and k-means.

Perform 1 clustering analysis to group years according to a selected cause.

4.2 Linear Regression

Briefly explain the concept of linear regression.

Perform 2 linear regression analyses. Plot the learned models.

5. Conclusion

6. Reflections

In this part, discuss any difficulties you had performing the analysis and how you solved those difficulties. Reflect on how the analysis process went for you, what you learnt, and what you might do differently next time.

For the data analysis (Sections 3 & 4), you need to provide the R code, an explanation of the code, and the result. Please present each R code snippet in a box with some comments. For example:

    # Draw a boxplot on the attribute “Income”
    boxplot(MyData$income)

The marking rubrics are viewable on Blackboard.

Report Format

Your report should be no less than 1,200 words and it would be best to be no longer than 2,000 words. Text in R code snippets is not counted.

The report MUST be formatted using the following guidelines:

• Title Page – Must not contain headers, footers, or page numbering. Include your name as the report’s author.
• Header – Report title
• Footer – your name and the page number
• Paragraph text – 12 point Calibri, single line spacing
• Headings – Arial in an appropriate type size
• Margins – 2.5cm on all margins
• Page numbering
• Executive summary to the last page of Table of Figures to use roman numerals (i, ii, iii, iv)
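
As a rough illustration of the Data Setup and Advanced Analysis steps described in the task, the following R sketch restricts a table to the years 2010 to 2018, counts deaths per year, and then runs k-means and a linear regression on those yearly counts. It is only a sketch under stated assumptions: the data frame is a small invented stand-in rather than the real dataset, and the column names Year and State are assumptions. With the real file, the table would come from read.csv() and the actual column names would be substituted.

```r
# Toy stand-in for the fatalities table: one row per death.
# The values are invented for illustration only, and the column
# names (Year, State) are assumptions about the real file.
deaths <- data.frame(
  Year  = c(2008, 2009, 2010, 2010, 2011, 2012,
            2012, 2012, 2015, 2018, 2018, 2019),
  State = c("NSW", "VIC", "NSW", "QLD", "VIC", "NSW",
            "NSW", "WA", "QLD", "NSW", "VIC", "NSW")
)

# Data setup: keep only the study period, 2010 to 2018 inclusive
study <- subset(deaths, Year >= 2010 & Year <= 2018)

# Count deaths per year by adding a column of ones and summing it
per_year <- aggregate(Deaths ~ Year,
                      data = transform(study, Deaths = 1),
                      FUN = sum)

# Clustering: group the years into 2 clusters by their yearly count
set.seed(1)  # k-means starts from random centres
km <- kmeans(per_year$Deaths, centers = 2, nstart = 10)
per_year$Cluster <- km$cluster

# Linear regression: yearly count as a linear function of the year
fit <- lm(Deaths ~ Year, data = per_year)

print(per_year)
print(coef(fit))
```

To draw the learned model, plot(per_year$Year, per_year$Deaths) followed by abline(fit) adds the fitted line to the scatter plot. The choice of 2 clusters is arbitrary here; with real data the number of clusters would need to be justified in the report.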