{"id":273,"date":"2022-02-19T08:31:18","date_gmt":"2022-02-19T13:31:18","guid":{"rendered":"https:\/\/pressbooks.library.ryerson.ca\/criticaldataliteracy\/?post_type=chapter&#038;p=273"},"modified":"2022-02-27T15:00:57","modified_gmt":"2022-02-27T20:00:57","slug":"beware-of-bias","status":"publish","type":"chapter","link":"https:\/\/pressbooks.library.torontomu.ca\/criticaldataliteracy\/chapter\/beware-of-bias\/","title":{"raw":"Beware of Bias","rendered":"Beware of Bias"},"content":{"raw":"Be aware that some data sources may be biased, such as:\r\n<ul>\r\n \t<li style=\"font-weight: 400\">Organizations reporting on themselves<\/li>\r\n \t<li style=\"font-weight: 400\">Data that is generated by interest groups<\/li>\r\n \t<li style=\"font-weight: 400\">Data that is self-reported, where there may be room for embellishment or incentives to report inaccurately (e.g. individuals reporting their own salary data)<\/li>\r\n<\/ul>\r\nReview the data sets you are using and make sure that they make sense.\u00a0 Review how the data is collected and how terms are defined. Some knowledge and research on the topic will help. Compare new sets of data against past years; any data series that shows drastic changes should be investigated and understood before it is presented. It may not be the quality of the data that needs to be considered but how it is presented.\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\"><strong>Examples<\/strong><\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\nData may not be biased exactly but may be socially constructed.\u00a0 For instance, here is a <a href=\"https:\/\/ontheline.github.io\/otl-racial-change\/index-caption.html\" target=\"_blank\" rel=\"noopener\">map showing racial change in Hartford, Connecticut from 1900-2018[footnote]\"Steven Manson, Jonathan Schroeder, David Van Riper, and Steven Ruggles. 
IPUMS National Historical Geographic Information System: Version 14.0 [Database]. Minneapolis, MN: IPUMS. 2019. DOI: http:\/\/doi.org\/10.18128\/D050.V14.0. Retrieved February 14, 2022 from https:\/\/ontheline.github.io\/otl-racial-change\/index-caption.html\"[\/footnote]<\/a>.\u00a0 Over time, definitions of race have changed and new terminology has emerged and become commonplace.\u00a0 In developing illustrations to visualize this data, you would want to be careful to acknowledge these changes.\u00a0 The explanation at the bottom of the map helps to explain this as accurately as possible.\u00a0 There is not necessarily one correct way to display this data.\u00a0 When developing the visualization, clearly explain your choices and limitations.\r\n\r\n<\/div>\r\n<\/div>\r\n<h1>How to Recognize Bad Data<\/h1>\r\nAs much as possible, try to recognize bad data.\u00a0 The following could be red flags:\r\n<ul>\r\n \t<li style=\"font-weight: 400\"><strong>Empty\/blank cells:\u00a0<\/strong> Ask whether respondents chose not to answer or whether the data is simply incomplete.<\/li>\r\n \t<li style=\"font-weight: 400\"><strong>Data that doesn't make sense:\u00a0<\/strong> For instance, dates should be in a date format.\u00a0 Canadian postal codes should be written as Letter\/Number\/Letter Space Number\/Letter\/Number (e.g. A1A 1A1).<\/li>\r\n<\/ul>\r\nMany open data sets come with source notes.\u00a0 Take the time to review the notes to understand how the data was collected and what it does (and doesn\u2019t) represent.","rendered":"<p>Be aware that some data sources may be biased, such as:<\/p>\n<ul>\n<li style=\"font-weight: 400\">Organizations reporting on themselves<\/li>\n<li style=\"font-weight: 400\">Data that is generated by interest groups<\/li>\n<li style=\"font-weight: 400\">Data that is self-reported, where there may be room for embellishment or incentives to report inaccurately (e.g. 
individuals reporting their own salary data)<\/li>\n<\/ul>\n<p>Review the data sets you are using and make sure that they make sense.\u00a0 Review how the data is collected and how terms are defined. Some knowledge and research on the topic will help. Compare new sets of data against past years; any data series that shows drastic changes should be investigated and understood before it is presented. It may not be the quality of the data that needs to be considered but how it is presented.<\/p>\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\"><strong>Examples<\/strong><\/p>\n<\/header>\n<div class=\"textbox__content\">\n<p>Data may not be biased exactly but may be socially constructed.\u00a0 For instance, here is a <a href=\"https:\/\/ontheline.github.io\/otl-racial-change\/index-caption.html\" target=\"_blank\" rel=\"noopener\">map showing racial change in Hartford, Connecticut from 1900-2018<a class=\"footnote\" title=\"&quot;Steven Manson, Jonathan Schroeder, David Van Riper, and Steven Ruggles. IPUMS National Historical Geographic Information System: Version 14.0 [Database]. Minneapolis, MN: IPUMS. 2019. DOI: http:\/\/doi.org\/10.18128\/D050.V14.0. 
Retrieved February 14, 2022 from https:\/\/ontheline.github.io\/otl-racial-change\/index-caption.html&quot;\" id=\"return-footnote-273-1\" href=\"#footnote-273-1\" aria-label=\"Footnote 1\"><sup class=\"footnote\">[1]<\/sup><\/a><\/a>.\u00a0 Over time, definitions of race have changed and new terminology has emerged and become commonplace.\u00a0 In developing illustrations to visualize this data, you would want to be careful to acknowledge these changes.\u00a0 The explanation at the bottom of the map helps to explain this as accurately as possible.\u00a0 There is not necessarily one correct way to display this data.\u00a0 When developing the visualization, clearly explain your choices and limitations.<\/p>\n<\/div>\n<\/div>\n<h1>How to Recognize Bad Data<\/h1>\n<p>As much as possible, try to recognize bad data.\u00a0 The following could be red flags:<\/p>\n<ul>\n<li style=\"font-weight: 400\"><strong>Empty\/blank cells:\u00a0<\/strong> Ask whether respondents chose not to answer or whether the data is simply incomplete.<\/li>\n<li style=\"font-weight: 400\"><strong>Data that doesn&#8217;t make sense:\u00a0<\/strong> For instance, dates should be in a date format.\u00a0 Canadian postal codes should be written as Letter\/Number\/Letter Space Number\/Letter\/Number (e.g. A1A 1A1).<\/li>\n<\/ul>\n<p>Many open data sets come with source notes.\u00a0 Take the time to review the notes to understand how the data was collected and what it does (and doesn\u2019t) represent.<\/p>\n<hr class=\"before-footnotes clear\" \/><div class=\"footnotes\"><ol><li id=\"footnote-273-1\">\"Steven Manson, Jonathan Schroeder, David Van Riper, and Steven Ruggles. IPUMS National Historical Geographic Information System: Version 14.0 [Database]. Minneapolis, MN: IPUMS. 2019. DOI: http:\/\/doi.org\/10.18128\/D050.V14.0. 
Retrieved February 14, 2022 from https:\/\/ontheline.github.io\/otl-racial-change\/index-caption.html\" <a href=\"#return-footnote-273-1\" class=\"return-footnote\" aria-label=\"Return to footnote 1\">&crarr;<\/a><\/li><\/ol><\/div>","protected":false},"author":388,"menu_order":4,"template":"","meta":{"pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-273","chapter","type-chapter","status-publish","hentry"],"part":247,"_links":{"self":[{"href":"https:\/\/pressbooks.library.torontomu.ca\/criticaldataliteracy\/wp-json\/pressbooks\/v2\/chapters\/273","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/pressbooks.library.torontomu.ca\/criticaldataliteracy\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/pressbooks.library.torontomu.ca\/criticaldataliteracy\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/pressbooks.library.torontomu.ca\/criticaldataliteracy\/wp-json\/wp\/v2\/users\/388"}],"version-history":[{"count":7,"href":"https:\/\/pressbooks.library.torontomu.ca\/criticaldataliteracy\/wp-json\/pressbooks\/v2\/chapters\/273\/revisions"}],"predecessor-version":[{"id":711,"href":"https:\/\/pressbooks.library.torontomu.ca\/criticaldataliteracy\/wp-json\/pressbooks\/v2\/chapters\/273\/revisions\/711"}],"part":[{"href":"https:\/\/pressbooks.library.torontomu.ca\/criticaldataliteracy\/wp-json\/pressbooks\/v2\/parts\/247"}],"metadata":[{"href":"https:\/\/pressbooks.library.torontomu.ca\/criticaldataliteracy\/wp-json\/pressbooks\/v2\/chapters\/273\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/pressbooks.library.torontomu.ca\/criticaldataliteracy\/wp-json\/wp\/v2\/media?parent=273"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/pressbooks.library.torontomu.ca\/criticaldataliteracy\/wp-json\/pressbooks\/v2\/chapter-type?post=273"},{"taxonomy":"contributor","embedd
able":true,"href":"https:\/\/pressbooks.library.torontomu.ca\/criticaldataliteracy\/wp-json\/wp\/v2\/contributor?post=273"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/pressbooks.library.torontomu.ca\/criticaldataliteracy\/wp-json\/wp\/v2\/license?post=273"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}