Sharpest manifestations of racism in the United States is its prison system.

One of the sharpest manifestations of racism in the United States is its prison system. A complex set of social and political structures, including the over-policing (Links to an external site.) of individuals of color and the war on drugs (Links to an external site.) have led to the disproportionate incarceration of people of color. These issues are very well summarized in the documentary 13th, which you can watch for free (Links to an external site.). In this assignment, you will use your data analysis and visualization skills to expose patterns of inequality using incarceration data collected by the Vera Institute (Links to an external site.). In doing so, you will: • Work with a real world, messy dataset • Choose your own direction for analysis, including which variables to analyze and compare • Design and build compelling visualizations to expose particular patterns in the data • Create a website (using RMarkdown) to share your results This assignment is more open-ended than previous ones -- you will be tasked with understanding the data itself, choosing the variables you want to analyze, and deciding the optimal way to write your code. Instructions As with previous assignments, follow this link (Links to an external site.) to create your own private repository for this assignment. This will automatically create a private repository which you will submit to Canvas as your assignment. You do not need to fork the repository (just clone it to your machine and begin working). Then, complete the steps outlined below by creating analysis.R and index.Rmd files. Data The data, which you will load from this GitHub repository (Links to an external site.)(specifically the incarceration_trends.csv file) is big and messy. A few hints on working with it: • Data Loading: It will take about a minute to load because it's so large. Don't worry if this takes a while when you start the assignment. Feel free to save a copy of the data (or a subset of it) in your directory • Read the documentation: Follow this link (Links to an external site.) to download the "Codebook" that explains the meaning of each variable. You'll need to understand what each variable represents, so read carefully! • Beware of missing values: You don't need to worry about "imputing" these, but when choosing what to visualize, you will want to focus on a particular location and/or year that has sufficient data (note, this varies across the variables) • Cross-reference their visual tool (Links to an external site.): For some steps, the mapping in particular, it may be helpful to check how much data is present / absent in the tool created by the organization. • Take your time: This data takes a while to get familiar with -- it took me a few hours to properly explore the data and understand its structure and meaning from the documentation. Be patient! Assignment structure For this assignment, you will create a report about incarceration in the U.S., which must include: • An introduction of the problem domain and a description of the variable(s) you are choosing to analyze (and why!) • A paragraph of summary information, citing at least 5 values calculated from the data • A chart that shows trends over time for a variable of your choice • A chart that compares two variables to one another • A map that shows how your measure of interest varies geographically See below for additional information for each component. Introduction + Summary Information As you introduce your report and the dataset, you should describe the variables that you've chosen to analyze. In doing so, make clear which measure(s) of incarceration you are focusing on. Then, you will share at least 5 relevant values of interest. These will likely be calculated using your DPLYR skills, answering questions such as: • What is the average value of my variable across all the counties (in the current year)? • Where is my variable the highest / lowest? • How much has my variable change over the last N years? Feel free to calculate and report values that you find relevant. Again, remember that the purpose is to think about how these measure of incarceration vary by race. Trends over time chart The first chart that you'll create and include will show the trend over time of your variable/topic. Think carefully about what you want to communicate to your user (you may have to find relevant trends in the dataset first!). Here are some requirements to help guide your design: • Show more than one, but fewer than ~10 lines: your graph should compare the trend of your measure over time. This may mean showing the same measure for different locations, or different racial groups. Think carefully about a meaningful comparison of locations (e.g., the top 10 counties in a state, top 10 states, etc.) • You must have clear x and y axis labels, • The chart needs a clear title • You need a legend for your different line colors, and a clear legend title • In your .Rmd file, make sure to describe why you included the chart, and what patterns emerged When I say "clear" or "human readable" titles and labels, that means that you should not just display the variable name. Variable comparison chart The second chart that you'll create and include will show how two different (continuous) variables are related to one another. Again, think carefully about what such a comparison means, and want to communicate to your user (you may have to find relevant trends in the dataset first!). Here are some requirements to help guide your design: • You must have clear x and y axis labels, • The chart needs a clear title • If you choose to add a color encoding (not required), you need a legend for your different color and a clear legend title • In your .Rmd file, make sure to describe why you included the chart, and what patterns emerged Map The last chart that you'll create and include will show how a variable is distributed geographically. Again, think carefully about what such a comparison means, and want to communicate to your user (you may have to find relevant trends in the dataset first!). Here are some requirements to help guide your design: • Your map needs a title • Your color scale needs a legend with a clear label • Use a map based coordinate system to set the aspect ratio of your map (see reading) • Use a minimalist theme for the map (see reading) • In your .Rmd file, make sure to describe why you included the chart, and what patterns emerged Feel free to get creative While the above layout is intended to set guidelines for our expectations, you should not feel restrained by this structure. For example, you can add additional encodings, use interactive charting packages in addition to ggplot2, layout multiple plots next to one another with the patchwork (Links to an external site.) package, and take this in various other directions! Submission Once you've finished editing your analysis.r and index.Rmd files, you'll need to compile your page to a website. Then, use git to add and commit the changes you've made, and push those changes to your repository on GitHub. Once you change the GitHub Pages permissions to publish your website, you should be able to see it on the web! Please submit the URL of your GitHub Repository as you assignment submission on Canvas. a3-incarceration a3-incarceration Criteria Ratings Pts This criterion is linked to a Learning OutcomeIntroduction + summary statistics Introduces the report and dataset, and provides at least 5 computed summary statistics from the data. 20 pts This criterion is linked to a Learning OutcomeTime Trend Chart Thoughtfully creates a chart of trends over time, describes *why* it was designed that way, and what patterns emerge. Chart meets requirements outlined above. 20 pts This criterion is linked to a Learning OutcomeVariable Comparison Chart Thoughtfully creates a chart comparing two variables, describes *why* it was designed that way, and what patterns emerge. Chart meets requirements outlined above. 20 pts This criterion is linked to a Learning OutcomeMap Thoughtfully creates a map of trends across the U.S., describes *why* it was designed that way, and what patterns emerge. Chart meets requirements outlined above. 20 pts This criterion is linked to a Learning OutcomeReport Quality and Completeness The report is successfully compiled into a hosted website. It doesn't include any unnecessary code, and the code (analysis.R) is well written and commented, lacking any linting errors. 20 pts Total Points: 100

IS IT YOUR FIRST TIME HERE? WELCOME

USE COUPON "11OFF" AND GET 11% OFF YOUR ORDERS