Genetic Databases and Biobanks
Risks to Consider When Sharing Our DNA
How much do you value your privacy? In today’s world, many people care about protecting their personal, financial, and social privacy. Soon, however, our society will likely have to consider protecting a whole new realm of information: our very own DNA. In recent years, technology has developed rapidly on all fronts, including in the field of genetics. This expansion has extended in a variety of directions: technologies such as CRISPR have allowed for novel genetic editing, while the emergence of commercial DNA testing sites has made ancestral information extremely accessible. In the wake of these breakthroughs, a variety of ethical questions and concerns have arisen as scientists try to navigate how this technology should be used and to what extent it can continue to develop without becoming morally questionable. One specific issue that has grown in prominence is the ethics of genetic databases, or ‘biobanks’, which have the potential to organize and store the genetic information of millions. As companies and governments gather more and more genetic information about the world’s population, questions arise regarding how all of that information could be used. Because the technology is new and the field remains largely unregulated, there is growing concern about the privacy of citizens’ DNA and other genetic information, which could be severely misused or abused. On the other hand, genetic databases provide an extensive range of benefits in combating crime, analyzing population health, and propelling scientific research forward. The debate regarding the morality of and need for genetic databases is sharply divided, and it poses a very important question: Are the benefits of these biobanks significant enough to outweigh the potential risks and, if so, how can we protect our most personal information?
What are Genetic Databases?
Genetic databases can include “molecular genetic data, standardized clinical data, genealogical data, and information on the health, lifestyle, and environment of individuals” (Australian Law Reform Commission). Currently, genetic databases vary in both their forms and their applications. Database Commons is a catalog of global biological databases that provides open access to a comprehensive collection of publicly available databases and their information. As of January 2020, it cataloged a total of “4615 databases, involving more than 7000 publications and approximately 2000 organizations throughout the world” (National Genomics Data Center Members and Partners, et al.). This number has continued to grow as more and more people have their DNA tested or analyzed and their genetic information stored. At present, no single comprehensive database accounts for all genetic information. Rather, there are a variety of smaller-scale databases, as private genealogy companies, government agencies, and scientists all maintain separate collections.
The Emergence of Genetic Databases
Genetic databases are a relatively novel development, first appearing toward the end of the 20th century. One of the most comprehensive genetic databases that exist today is GenBank, which was created in 1979 at the Los Alamos National Laboratory in New Mexico and was originally called the Los Alamos Sequence Database (Choudhuri). It was renamed GenBank in 1982 and became a public database. The data contained in GenBank is submitted by the scientific community, mostly originating from genetic sequencing projects, and is used by other scientists throughout the world.
Another major source of genetic information can be found in commercially originated genetic databases. Genetic testing kits offered by companies like 23andMe and Ancestry have provided the general population with access to massive amounts of insight into their biological makeup. Genealogical DNA testing “first became available on a commercial basis in the year 2000 with the launch of Family Tree DNA and Oxford Ancestors” (International Society of Genetic Genealogy). As the amount of genetic information and the sheer number of genetic databases have continued to increase, they have proven to have many useful applications.
What are the Applications?
Genetic databases serve a variety of purposes depending on the database and who is accessing it. One major use is in medicine. According to a publication by the Australian Law Reform Commission, some of the medical applications include “studies to identify the gene sequences associated with inherited diseases; association studies to find correlations between a disease and a genetic change; pharmacogenetic studies to determine if there is a genetic basis for certain adverse reactions to drugs” (ALRC). Genetic databases and biobanks provide scientists with a brand new scale of information, allowing them to complete research that could never be done before. The uses of biological databases extend far beyond the realms of science and medicine, though. Biobanks have proved to be extremely useful in the field of forensics, as they allow criminals to be tracked down from something as small as a piece of hair left at a crime scene. While genetic databases have an expansive range of very beneficial applications, a variety of potential risks and dangers accompany them. However, for each of these issues, prospective solutions can be proposed.
Problem 1: Medical Misfires (Inaccuracies Within Databases)
While genetic databases can be extremely useful in medicine, especially in analyzing genetic variants, they can also lead to harmful consequences when the vast amount of information they contain is misinterpreted. According to the 2008 publication Genomic Data Resources: Challenges and Promises by Warren Lathe, Jennifer Williams, Mary Mangan, and Donna Karolchik, information is being added to genetic databases so rapidly that it not only “makes it difficult to maintain accuracy and accessibility”, but also makes the databases difficult to understand and navigate. Even though there are a variety of systems in place to organize the expanse of information, it can still be challenging for some researchers or doctors to use these resources effectively. Genetic databases are relied on to determine the significance of certain genetic variants while patients undergo diagnosis and treatment, so if the databases are difficult to understand, potentially serious consequences can occur, including misdiagnosis of genetic diseases.
A tragic example of the risks that accompany the lack of organization in genetic databases is the 2016 lawsuit Williams v. Athena, which occurred in South Carolina. The publication “Public variant databases: liability?” by Adrian Thorogood, Robert Cook-Deegan, and Bartha Maria Knoppers from Genetics in Medicine describes the case, which is centered around the death of Christian Jacob Millare, who died at age three in 2008 (Thorogood). In 2007, the Athena laboratory tested Christian’s DNA to see if he had a gene mutation that would indicate a severe and unique form of epilepsy. If he had the variant, he would have needed a specific treatment for his seizures. Based on its interpretation of information found in genetic databases, the laboratory misidentified Christian as not having the variant, and he did not receive that treatment. As long as the vast amount of information in genetic databases remains poorly organized, mistakes like this remain a major risk.
Solution: New Organization Techniques
The vast majority of problems associated with genetic database applications in the field of medicine originate from one key problem: the lack of organization. Until very recently, genetic databases were primarily used by researchers in laboratories, who were capable of understanding the inherently complex nature of these biobanks. However, as genetic databases have shifted toward more universal and commercial uses, many users are left confused and disoriented when attempting to navigate them. To resolve this issue, genetic databases must become more streamlined, easier to use, and clearer in describing what genetic information is being presented. With regard to organization techniques, a variety of solutions have been proposed. One solution would consist of placing “a greater focus on the education of database biocurators in learning institutions and the standardized inclusion of sequence data and references in publications” (Lathe, Warren C., et al.). By encouraging more education on what genetic databases are and how to use them, a wider audience would be able to navigate them successfully. Another potential solution is the idea of community contribution to genetic databases, which “envisions a sort of ‘wikification’ of data update and curation, in which research communities curate their databases themselves” (Lathe, Warren C., et al.). This method would ensure that researchers and organizations are clear about what their data is, and the responsibility for maintaining accuracy would fall on them. In this way, any sources of error could be held directly accountable. Neither of these solutions is necessarily the “right” one. Rather, it is important to use both methods, so that the lack of organization and the resulting misunderstanding are addressed on both the researchers' end and the users' end.
Problem 2: Potential of a Surveillance State (Government Misuse)
Individual governments also have and use genetic databases. According to the International Society of Genetic Genealogy, “the first government database (the National DNA Database (NDNAD)) was set up by the United Kingdom in April 1995.” Today, a variety of countries have government-endorsed genetic databases, including the United States, which uses the Combined DNA Index System (CODIS) and the National DNA Index System (NDIS), primarily in criminal cases. Almost every developed country has some form of a genetic database. However, more expansive government-endorsed genetic databases have become controversial, as the true purposes of the extreme oversight are questioned by citizens.
The genetic database being created by the Chinese government could be considered the most comprehensive government-endorsed biobank to date. The efforts of the government were first reported in 2017 and were initially described as an attempt to create a vast forensic database to aid in solving crime. According to the article “China’s massive effort to collect its people’s DNA concerns scientists” published in Nature by David Cyranoski, though, both researchers and Chinese citizens are concerned that this is an attempt to deepen social control and institute a more extensive surveillance system over the population. This concern stems from the type of genetic information that the government is collecting. China has been collecting Y-STR samples from men of all ages under the explanation that a majority of the crime that occurs in China is committed by men. Although not every man has had his genetic information collected, the government's reach could extend far beyond the men it has sampled. This is because “a Y-STR sample from an unknown man can be linked to all his male relatives on his paternal side” (Cyranoski), so in theory, China can create a genetic family tree that connects a major portion of the population from a relatively small amount of data. This has caused many concerns, as China has increased population surveillance in other aspects of daily life as well.
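To make the reach of paternal-line matching concrete, the following is a minimal sketch and not a description of any real forensic system. It assumes, for simplicity, that a Y-STR profile passes unchanged from father to son (real profiles can mutate), and the pedigree, the names, and the functions `paternal_founder` and `linked_men` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy pedigree: each man maps to his father
# (None marks the founder of a paternal line).
FATHER_OF = {
    "Wei": None, "Jun": "Wei", "Bo": "Wei", "Tao": "Jun", "Lei": "Bo",
    "Chen": None, "Hao": "Chen", "Ming": "Hao",
    "Yong": None,
}

def paternal_founder(man, father_of):
    """Walk up the father links until reaching the founder of the paternal line."""
    while father_of[man] is not None:
        man = father_of[man]
    return man

def linked_men(sampled, father_of):
    """Return every man who shares a paternal line with at least one sampled man.

    Simplifying assumption: all men on one paternal line share a Y-STR profile.
    """
    lines = defaultdict(set)
    for man in father_of:
        lines[paternal_founder(man, father_of)].add(man)
    covered = set()
    for man in sampled:
        covered |= lines[paternal_founder(man, father_of)]
    return covered

sampled = {"Tao", "Ming"}
covered = linked_men(sampled, FATHER_OF)
print(f"{len(sampled)} sampled men link to {len(covered)} of {len(FATHER_OF)} men")
# prints "2 sampled men link to 8 of 9 men"
```

In this toy pedigree, samples from only two men are enough to link eight of the nine men, which illustrates why a partial collection can still cover a large share of a population.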
Video: Gravitas: Is China creating a Global DNA Database?
While the potential of an extreme surveillance state is a risk that accompanies the institution of genetic databases, there are also concerns about potential damages to minority communities. Cyranoski elaborates on this, explaining that China “has also collected DNA from minority ethnic groups in Tibet and in the northwest province of Xinjiang, which has been criticized by human rights groups” (Cyranoski). There is current controversy over how the Chinese government has been treating the minority group of Uighur Muslims who live in Xinjiang, and reports of human rights abuses have surfaced. If countries can collect and store the genetic information of their citizens without regulation or other oversight, there is a potential for abuse and misuse, some examples of this being the institution of a surveillance state or minority discrimination.
Solution: International Conversations Regarding Regulations
Resolving problems regarding government misuse of genetic databases could prove to be a challenge, considering that it is difficult for one country to hold another accountable for wrongdoing. However, there have certainly been instances of international cooperation, as can be seen in the success of organizations like the United Nations. To address potential government misuse of genetic databases, an international organization that regulates genetic technologies must be instituted. An organization that could fulfill this role already exists: the International Bioethics Committee (IBC) of UNESCO. The IBC was created in 1993 and is described by the UNESCO website as “a body of 36 independent experts that follows progress in the life sciences and its applications in order to ensure respect for human dignity and freedom.” The IBC focuses on a variety of ethical and legal issues that arise as genetic technologies progress, but mainly on problems concerning rights to the genome. To prevent governments from exploiting and misusing genetic databases, organizations like the IBC must expand their reach to cover these biobanks. Regulations on governmental use of genetic databases, along with methods of monitoring compliance, should be instituted to protect citizens from having their genetic information used against them.
Problem 3: Exploitation of the Consumer (Law Enforcement Abuse)
While concerns about government use of citizen genetic data have been discussed, it is also important to consider the potential risks at the law enforcement level. There is currently a lack of adequate regulation on who can access information in genetic databases, and to what extent. As of now, many law enforcement agencies use genetic databases for forensic purposes. However, there have recently been a variety of cases in which law enforcement seeks out and uses the information found in private or commercial biobanks to solve criminal cases. Public opinion on whether police should be able to access commercially collected genetic data is divided. According to the article “DNA Databases Are Boon to Police But Menace to Privacy, Critics Say” by Lindsey Van Ness, a recent survey found that out of “more than 4,200 U.S. adults, 48% said they were OK with DNA testing companies sharing customers’ genetic data with police. A third said it was unacceptable, and 18% were unsure” (Van Ness). A key example of a situation that has raised this question is the Golden State Killer case, in which a cold case was solved using information found in a consumer-based biobank.
The Golden State Killer, who has now been identified as Joseph James DeAngelo Jr., committed a series of murders, rapes, and burglaries in California during the 1970s and 1980s. DeAngelo, a former police officer, managed to avoid being caught for decades, until 2018, when law enforcement decided to expand their investigation by using the genetic database GEDmatch. While GEDmatch does not offer DNA tests, consumers can upload genetic information from other companies’ tests, including 23andMe and Ancestry, to connect with a wider range of people. The terms and conditions of the database had initially prohibited the distribution of the consumers’ genetic information for law enforcement purposes. However, according to the article “The Messy Consequences of the Golden State Killer Case” by Sarah Zhang, “At one point, the site secretly allowed police to upload DNA from the scene of a violent assault—following a personal appeal from the detective to one of GEDmatch’s co-founders” (Zhang). Law enforcement's use of GEDmatch, including in the investigation that identified the Golden State Killer, led to a major backlash from users, who were angered by the possibility that their previously private genetic information could now be used to incarcerate their relatives without their consent. It also inflamed existing fears about DNA databases and the privacy concerns accompanying them, emphasizing the potential risk of abuse by law enforcement agencies.
Video: You Should Be Worried About Your DNA Privacy
Solution: Business Regulations
The most important solution to problems like those described above is the institution of new and comprehensive regulations on genealogy companies. Some current laws and regulations focus on genetic information, but these often fall short. For example, Title I of the Genetic Information Nondiscrimination Act of 2008 “prevents group health and Medicare supplemental plans—but not life, disability, or long-term care plans—from using genetic information to discriminate against you when it comes to insurance” (Lynch and Hussain), but it does not address privacy. While the Health Insurance Portability and Accountability Act (HIPAA) focuses somewhat more on protecting privacy, it only applies to specific entities and businesses, and “many non-covered entities collect genetic information, such as online genetic testing companies like 23andMe and genealogy websites like Ancestry.com” (Lynch and Hussain). Hence, there is currently a severe lack of privacy protections or regulations in place. To protect the privacy of consumers, genetic-testing and ancestry companies need to institute policies that assure customers that their data will not be available to law enforcement or other parties that could take advantage of it. Along with this, to ensure that businesses hold to these policies, governments should place more comprehensive regulations on these companies.
The Importance of Considering the Risks as We Move Forward
Novel genetic technologies have propelled science forward in countless ways and have had an immensely positive impact on the research and medical communities. While there are certainly many benefits to genetic databases, it is crucial to consider the potential dangers and risks that could accompany them, especially while the technology is still relatively new. Scientists and government officials need to get ahead of these problems; otherwise, the risk of medical mistakes, discrimination, and severe privacy violations will always loom over our society. The sea of data currently present in biobanks is severely disorganized. There is also a lack of comprehensive regulations describing the ethical limits of genetic databases, and inappropriate use of these databases creates specific and dangerous risks, considering that they can be used to identify individuals even if their own genetic information is not present in the biobanks. By focusing on cases that expose the negative aspects of genetic databases, we can identify which problems need to be resolved. These issues must be addressed, as the rapid growth of genetic technologies is unlikely to slow, and they can only be resolved through better organization, simplification, and comprehensive regulation on both a national and an international scale.
Works Cited
Choudhuri, Supratim. “Data, Databases, Data Format, Database Search, Data Retrieval Systems, and Genome Browsers.” Bioinformatics for Beginners, Academic Press, 16 May 2014, www.sciencedirect.com/science/article/pii/B9780124104716000050.
Cyranoski, David. “China's Massive Effort to Collect Its People's DNA Concerns Scientists.” Nature News, Nature Publishing Group, 7 July 2020, www.nature.com/articles/d41586-020-01984-4.
“Genetic Genealogy.” ISOGG Wiki, International Society of Genetic Genealogy, isogg.org/wiki/Genetic_genealogy.
“International Bioethics Committee (IBC).” UNESCO, UNESCO, 17 Mar. 2021, en.unesco.org/themes/ethics-science-and-technology/ibc.
Lathe, Warren C., et al. “Genomic Data Resources: Challenges and Promises.” Nature News, Nature Publishing Group, 2008, www.nature.com/scitable/topicpage/genomic-data-resources-challenges-and-promises-743721/.
Lynch, Jennifer, and Saira Hussain. “Genetic Information Privacy.” Electronic Frontier Foundation, Electronic Frontier Foundation, www.eff.org/issues/genetic-information-privacy.
National Genomics Data Center Members and Partners, et al. “Database Resources of the National Genomics Data Center in 2020.” OUP Academic, Oxford University Press, 8 Nov. 2019, academic.oup.com/nar/article/48/D1/D24/5614641.
Thorogood, Adrian, et al. “Public Variant Databases: Liability?” Nature News, Nature Publishing Group, 15 Dec. 2016, www.nature.com/articles/gim2016189.
Van Ness, Lindsey. “DNA Databases Are Boon to Police But Menace to Privacy, Critics Say.” Stateline, The Pew Charitable Trusts, 20 Feb. 2020, www.pewtrusts.org/en/research-and-analysis/blogs/stateline/2020/02/20/dna-databases-are-boon-to-police-but-menace-to-privacy-critics-say.
“What Are Human Genetic Research Databases?” ALRC, Australian Law Reform Commission, www.alrc.gov.au/publication/essentially-yours-the-protection-of-human-genetic-information-in-australia-alrc-report-96/18-human-genetic-research-databases/what-are-human-genetic-research-databases/.
Zhang, Sarah. “The Messy Consequences of the Golden State Killer Case.” The Atlantic, Atlantic Media Company, 2 Oct. 2019, www.theatlantic.com/science/archive/2019/10/genetic-genealogy-dna-database-criminal-investigations/599005/.